This project uses Synthea™ to generate realistic synthetic patient data for medical notes.
You can specify the Synthea CSV directory directly in your config file by setting synthea_csv_dir in your config.yaml to the directory that contains the Synthea CSV files.
Example config.yaml:
Then generate notes using:
This project requires Synthea™, an open-source synthetic patient generator, as an external dependency. You must clone and build Synthea yourself before using mednotegen.
To set up Synthea:
- Clone Synthea:

      git clone https://github.com/synthetichealth/synthea.git

- Build the Synthea JAR:

      cd synthea
      ./gradlew build check test
      cp build/libs/synthea-with-dependencies.jar .
      cd ..

Ensure synthea-with-dependencies.jar is in the synthea/ directory at the root of your project.
You can customize patient generation and report output using a config.yaml file. Example options:
- count: Number of reports to generate
- output_dir: Directory to save generated PDFs
- use_llm: If true, uses an OpenAI LLM to generate the report text
- seed: Random seed for reproducibility
- reference_date: Reference date for age calculations (YYYYMMDD)
- clinician_seed: Optional, separate seed for clinician assignment
- gender: Gender filter for patients (male, female, or any)
- min_age, max_age: Age range for patients
- state: US state for Synthea simulation
- modules: Synthea disease modules to enable
- local_config: Path to a custom Synthea config file
- local_modules: Directory for custom Synthea modules
For an up-to-date and complete list of available modules, see the official Synthea modules directory.
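To make the option list above concrete, here is a minimal sketch of how those settings could be held and sanity-checked once config.yaml has been loaded. The `NoteConfig` class, the `validate` function, and every default value are hypothetical and are not taken from mednotegen's code; only the option names and the allowed gender values come from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class NoteConfig:
    """Hypothetical container mirroring the options listed above."""
    count: int = 10                       # assumed default, not from mednotegen
    output_dir: str = "output"            # assumed default, not from mednotegen
    use_llm: bool = False
    seed: Optional[int] = None
    reference_date: Optional[str] = None  # expected as YYYYMMDD
    clinician_seed: Optional[int] = None
    gender: str = "any"
    min_age: int = 0
    max_age: int = 120
    state: Optional[str] = None
    modules: List[str] = field(default_factory=list)
    local_config: Optional[str] = None
    local_modules: Optional[str] = None


def validate(cfg: NoteConfig) -> List[str]:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if cfg.count < 1:
        problems.append("count must be at least 1")
    if cfg.gender not in ("male", "female", "any"):
        problems.append("gender must be male, female, or any")
    if cfg.min_age > cfg.max_age:
        problems.append("min_age must not exceed max_age")
    if cfg.reference_date is not None:
        try:
            datetime.strptime(cfg.reference_date, "%Y%m%d")
        except ValueError:
            problems.append("reference_date must be in YYYYMMDD format")
    return problems
```

Running a check like this before starting a long generation run surfaces simple mistakes, such as a dashed date or an inverted age range, early.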
If you see errors about missing patients.csv, medications.csv, or conditions.csv, make sure you have generated Synthea data and that the path you provide (via synthea_csv_dir, CLI, or config) points to the correct directory containing those files.
If you installed mednotegen via pip, the default location for the Synthea CSV files is inside the package directory. For custom or system-wide Synthea runs, always specify the output CSV directory explicitly.
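Before generating notes, it can help to confirm that the directory you plan to use actually contains the three files named above. The following is a small standalone sketch; the helper name `missing_csvs` is hypothetical and is not part of mednotegen.

```python
from pathlib import Path
from typing import List

# File names taken from the error description above.
REQUIRED_CSVS = ("patients.csv", "medications.csv", "conditions.csv")


def missing_csvs(csv_dir: str) -> List[str]:
    """Return the required Synthea CSV file names that are absent from csv_dir."""
    base = Path(csv_dir)
    return [name for name in REQUIRED_CSVS if not (base / name).is_file()]
```

An empty list means all three files are present; otherwise the returned names show which files still need to be generated or which path is wrong.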
- No CSV files generated:
  - Make sure you edited the correct synthea.properties and used the -c flag when running Synthea.
  - Ensure exporter.csv.export = true is set and not overridden elsewhere in the file.
- FileNotFoundError for CSVs:
  - Confirm the CSV files exist in the path specified by synthea_csv_dir or in the expected package location.
- ValueError: No patients found matching the specified filters:
  - Check your age/gender filters in config.yaml. Try relaxing them if you have too few patients.
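For the last error above, one way to see whether your filters are too strict is to count how many generated patients would pass them. The sketch below is an illustration under stated assumptions: it relies on the BIRTHDATE (YYYY-MM-DD) and GENDER (M or F) columns of Synthea's patients.csv, the mapping from male/female to M/F is assumed, and the function names are hypothetical. mednotegen's own filtering may differ, for example in how it treats deceased patients.

```python
import csv
from datetime import date
from typing import Iterable, Mapping

# Assumed mapping from config values to Synthea's GENDER column codes.
GENDER_CODES = {"male": "M", "female": "F"}


def age_on(birthdate: date, ref: date) -> int:
    """Whole years elapsed between birthdate and ref."""
    had_birthday = (ref.month, ref.day) >= (birthdate.month, birthdate.day)
    return ref.year - birthdate.year - (0 if had_birthday else 1)


def count_matching(rows: Iterable[Mapping[str, str]], gender: str,
                   min_age: int, max_age: int, ref: date) -> int:
    """Count rows whose GENDER and age on ref satisfy the filters."""
    wanted = GENDER_CODES.get(gender)  # None means no gender filter ("any")
    total = 0
    for row in rows:
        if wanted is not None and row["GENDER"] != wanted:
            continue
        age = age_on(date.fromisoformat(row["BIRTHDATE"]), ref)
        if min_age <= age <= max_age:
            total += 1
    return total


def count_in_file(path: str, gender: str, min_age: int,
                  max_age: int, ref: date) -> int:
    """Apply count_matching to a patients.csv file on disk."""
    with open(path, newline="") as handle:
        return count_matching(csv.DictReader(handle), gender, min_age, max_age, ref)
```

If the count is zero or very small, widen the age range, set gender to any, or generate a larger population.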
Edit src/main/resources/synthea.properties in your Synthea directory so that it sets exporter.csv.export = true. Remove or comment out any exporter.csv.export = false lines.
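A quick way to confirm that the setting is not overridden later in the file is to read the properties text and report the last uncommented assignment. This is a simplified sketch, not a full properties parser: it only handles `key = value` lines and assumes that a later duplicate key replaces an earlier one, as is the case when a file is loaded with java.util.Properties. The function name is hypothetical.

```python
from typing import Optional


def effective_property(text: str, key: str) -> Optional[str]:
    """Return the last uncommented value assigned to key, or None if it is never set."""
    value = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()  # a later assignment replaces an earlier one
    return value
```

Passing the contents of synthea.properties and the key `exporter.csv.export` should yield `"true"` once the file is edited correctly.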
From your Synthea directory, clean any old output and generate new data:
- The -p 1000 flag generates 1000 patients.
- After running, check for CSV files in output/csv/.
See README_SYNTHEA_NOTICE.md and LICENSE-APACHE-2.0 for license and attribution requirements.