A Copilot and Simcenter Prescan adventure
Another day, another mind-numbing routine as an automotive engineer. We’re stuck building yet another simulation environment for a scenario that feels all too familiar. Place the ten-year-old Austrian pine here, rotate the gray car to exactly 32.5 degrees yaw over there — rinse and repeat. We hack together temporary codebases just to get it done, knowing they’ll be tossed aside because none of us are developers and the APIs keep changing anyway. It’s a tedious cycle of manual work and wasted effort, and honestly, the boredom is almost worse than the repetition.
But what if that monotony vanished? Instead of wrestling with yaw angles and tree placements, imagine typing a simple description — ‘a gray car approaches an intersection with a pedestrian crossing under cloudy skies’ — and watching it transform into a fully configured simulation. No more disposable codebases, no more API headaches. Generative AI, paired with Simcenter Prescan, turns tedious setup into an instant, creative process. Suddenly, we’re not just engineers — we’re storytellers shaping scenarios with words, and letting AI handle the heavy lifting.
In this demo, we explore how natural language and AI-driven automation can transform the way engineers design and validate critical driving situations. By simply writing a short description, an AI assistant generates a Python script that configures and runs the scenario in Prescan, bridging the gap between conceptual thinking and executable simulation. This collaboration between Copilot and Simcenter Prescan demonstrates a new era of efficiency and creativity in automotive testing — where ideas move from text to test track in seconds.
- Simcenter Prescan is a high-fidelity simulator for automotive scenarios. It provides accurate, physics-based data essential for testing and validating autonomous systems.
- GitHub Copilot is a developer tool powered by large language models. It assists with code generation, debugging, and answering code-related queries. For this demo, we use Copilot within Visual Studio Code (VSCode), one of its most effective environments.
We also tried different clients such as Cline and Kilo Code; all of these clients can use Model Context Protocol (MCP) servers like the one we created.
In many of the scenarios we developed we had to fix small parts of the generated code. These were not Python syntax errors or incorrect API usage, but wrong constants.
Why this matters
This demo illustrates how AI can accelerate simulation workflows by translating natural language into executable code. It reduces manual effort, minimizes errors, and opens the door to more intuitive scenario design.
The key advantages of this script-based approach are:
- Reproducibility: scripts allow the environment to be reproduced accurately.
- Controllability: We know where our objects are placed and where they are going. Object placement is exact. A pedestrian isn’t “roughly” behind a car — it’s exactly 0.5 meters behind, centered on the sidewalk.
- Dynamic precision: While many tools generate static geometries, few define actor logic. Leveraging OpenSCENARIO, the Prescan DMAPI enables robust, scriptable dynamic behavior.
- Maintainability: If Prescan updates tomorrow, your experiment won’t break. The API provides clear change guidelines, and the code remains readable — not locked in a proprietary format.
- Future-proofing: When today’s AI agent becomes obsolete, your experiment — and its logic — remains intact. We keep both the code and the prompt that produced it, so the result stays the same: the user’s intent. Unlike black-box renderers such as NVIDIA Cosmos or Genie 3, this pipeline preserves your intent.
- Cost efficiency: If we were to reuse our prompt every time we wanted to render our experiment, make a small logic change, or react to a pipeline update, we would need to run a large LLM or reasoning model very frequently. Frequent use of AI models means more computational power, and that costs a lot.
- Time efficiency: New sensor? New vehicle? No need to start from scratch. We can just update the script to include the new module.
- Explainability: We cannot explain the black box of an LLM. However, we can see what our agent did at every step and what it didn’t. Even when it runs in the background creating hundreds of scenarios for us, we can stop it and see what it did.
Most of these points stem from Prescan’s API-first notion, but they very much apply when developing such an AI agent as well.
So let’s dive in and explore how AI can reshape the way we build and test automotive simulations!
Notes
— DMAPI = Data Model API
— Of course Prescan itself has drawbacks as a simulator, but every simulator has drawbacks or yet-to-be-implemented features.
— Prescan = Simcenter Prescan
Creating an MCP server for Prescan
Creating an MCP server helps us launch, configure, view and develop Prescan experiments. If we would like to start Prescan, we can now do so (even remotely) with a prompt:
We can simply launch the Prescan Manager, even if that means running it remotely.
We can also renew the installation or re-parse the documentation and read the newest Prescan documentation in case there were recent changes. Reading the documentation is done via a search resource provided by the MCP server, which returns examples of how to use Prescan’s Python libraries. More generally, the MCP server helps the client read all resources offered by Prescan.
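To give an idea of what such a server looks like, here is a minimal sketch built on the official MCP Python SDK. The tool and resource names, the executable path, and the documentation lookup are illustrative assumptions, not the exact server we built.

```python
# Minimal MCP-server sketch (tool names, paths and the documentation index
# are assumptions for illustration, not the server described above).
import subprocess
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("prescan-helper")

# Hypothetical install location of the Prescan Process Manager executable.
PRESCAN_MANAGER = Path(r"C:\Program Files\Simcenter Prescan\PrescanProcessManager.exe")
DOCS_DIR = Path("./prescan_docs")  # pre-parsed documentation snippets


@mcp.tool()
def launch_prescan_manager() -> str:
    """Start the Prescan Process Manager on the machine running this server."""
    subprocess.Popen([str(PRESCAN_MANAGER)])
    return "Prescan Process Manager launched."


@mcp.tool()
def run_experiment_script(script_path: str) -> str:
    """Run a generated Python experiment script and return its output."""
    result = subprocess.run(
        ["python", script_path], capture_output=True, text=True, timeout=600
    )
    return result.stdout + result.stderr


@mcp.resource("docs://{query}")
def search_docs(query: str) -> str:
    """Very naive documentation search over pre-parsed API example files."""
    hits = []
    for doc in DOCS_DIR.glob("*.md"):
        text = doc.read_text(encoding="utf-8")
        if query.lower() in text.lower():
            hits.append(f"# {doc.name}\n{text[:2000]}")
    return "\n\n".join(hits) or f"No documentation found for '{query}'."


if __name__ == "__main__":
    mcp.run()
```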
Generating the network
The information shared between Prescan and the client is limited, which confines the agent’s capabilities for network generation. If users want a simple network, our developer agent can handle it easily. For example, the following prompt:
Reading the prescan documentation, create a script that creates a curvy 1km road of two lanes at each side that starts from south tilts to the north west by 30 degrees. Each lane has a width of 3.5 m.
Run this script using the mcp server and re-iterate till the script runs successfully. Do not use try-catch
The result is an S-shaped curvy road. We didn’t specify how many curves, so the agent assumed two. This kind of ambiguity can lead to different networks: the agent very easily makes assumptions to fill in missing data. This means that describing and generating networks does not scale easily; generating a network accurately requires very precise definitions with no ambiguities.
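To make the ambiguity concrete, here is a small, purely illustrative sketch (not the agent’s generated script, and it does not use the Prescan API) that builds a 1 km centreline whose heading goes from due north to 30 degrees towards the north-west. The number of curves is an explicit parameter, exactly the value the prompt above never fixed.

```python
# Illustrative only: centreline points for a 1 km road whose heading rotates
# from due north (0 deg) to 30 deg towards the north-west.  The number of
# bends is an explicit parameter -- the value the prompt left ambiguous.
import math

def centreline(length_m=1000.0, final_heading_deg=30.0, num_curves=2, step_m=5.0):
    points = [(0.0, 0.0)]
    x, y = 0.0, 0.0
    n_steps = int(length_m / step_m)
    for i in range(n_steps):
        s = (i + 0.5) / n_steps  # normalised arc length, 0..1
        # Base heading ramps linearly to the final value; the sine term adds
        # alternating bends controlled by num_curves, giving the S-shape.
        heading_deg = final_heading_deg * s + 10.0 * math.sin(num_curves * math.pi * s)
        heading = math.radians(heading_deg)
        # Heading measured from north, positive towards the west.
        x -= step_m * math.sin(heading)
        y += step_m * math.cos(heading)
        points.append((x, y))
    return points

pts = centreline(num_curves=2)
print(f"{len(pts)} points, end point: x={pts[-1][0]:.1f} m, y={pts[-1][1]:.1f} m")
```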
If we zoom out a bit, we can see that this is not solely an AI-agent problem; it already occurs in the automotive industry. This is why, in Europe, the Euro NCAP safety assessment programme was created (see https://www.euroncap.com/en), an ongoing effort to standardize automotive scenarios.
Developing a scenario description, though, can still contain ambiguities, and this is exactly the power of this framework. We can ask an AI assistant (or several) to look at these documents and flag any unclear specifications. We could also go a step further and generate the scenarios from their documents using the demo’s proposed solution. A clearer scenario makes repetition possible and fortifies the scenario itself.
Here is a scenario from Euro NCAP that we tried:
The pipeline we used here was a little different: since we introduced parsing from a PDF document with specifications, we needed an agent to write a scenario out of it. So we switched to the pipeline below:
We first transformed the PDF into text using IBM’s Docling tool. The amazing part about Docling is that it can be used from an API or an MCP server and can translate both the diagrams and the text of a PDF into Markdown or plain text. It can even prepare these files to be used with RAG. Here is a simple Python snippet we used to parse images and text at the same time:
```python
import json

from docling.datamodel.accelerator_options import AcceleratorDevice, AcceleratorOptions
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.datamodel.pipeline_options import smolvlm_picture_description
from docling_core.types.doc.document import PictureDescriptionData

# The source document
DOC_SOURCE = "./ncap.pdf"


def constructOptions() -> PdfPipelineOptions:
    # Run the parsing pipeline on CPU with a fixed number of threads.
    accelerator_options = AcceleratorOptions(
        num_threads=8, device=AcceleratorDevice.CPU
    )
    pipeline_options = PdfPipelineOptions()
    pipeline_options.accelerator_options = accelerator_options
    pipeline_options.do_table_structure = False
    pipeline_options.table_structure_options.do_cell_matching = False
    # Describe embedded figures with a small vision-language model.
    pipeline_options.do_picture_description = True
    pipeline_options.picture_description_options = (
        smolvlm_picture_description  # <-- the model choice
    )
    pipeline_options.picture_description_options.prompt = (
        """Analyze the provided image representing geometry requirements of a technical specification document and provide a structured description in up to five concise, factual sentences. Maintain technical precision while avoiding subjective interpretations, focusing strictly on observable elements in the specification layout."""
    )
    pipeline_options.images_scale = 1.0
    pipeline_options.generate_picture_images = False
    return pipeline_options


def main():
    pipeline_options = constructOptions()
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(
                pipeline_options=pipeline_options,
            )
        }
    )
    doc = converter.convert(DOC_SOURCE).document
    # Export both Markdown (for the agents) and JSON (for inspection).
    with open("ncap_parsed.md", "w", encoding="utf-8") as f:
        f.write(doc.export_to_markdown())
    with open("ncap_parsed.json", "w", encoding="utf-8") as f:
        f.write(json.dumps(doc.export_to_dict(), indent=2))


if __name__ == "__main__":
    main()
```
After preparing our files, we can provide a prompt to a software-architect agent to create a task description for our developer agent. This is the prompt we used:
Based on the text that will be provided in euro-ncap-protocol-crash-avoidance-frontal-collisions-v10.md, generate a comprehensive automotive simulation scenario description by systematically extracting and organizing all relevant technical, environmental, and operational parameters.
Include:
1. **Scenario Context**:
— Purpose (e.g., safety testing, fuel efficiency analysis, autonomous vehicle validation)
— Geographic/terrain details (e.g., urban highway, mountainous region, weather conditions)
— Traffic conditions (density, vehicle types, pedestrian interactions)
2. **Vehicle Specifications**:
— Model parameters (type, powertrain type)
— Sensor/ADAS configuration (camera FOV, lidar range, radar placement)
3. **Behavioral Parameters**:
— Positions and orientation
— Trajectories and speed
— Traffic rules and exceptions (e.g. speed limits)
4. **Simulation Requirements**:
— Time duration and granularity (simulation runtime, time-step precision)
— Success/failure criteria (collision thresholds, efficiency benchmarks)
— Required output metrics (e.g., energy consumption, trajectory logs, system latency)
Present parameters in a structured table format with units of measurement and value ranges. Propose justified default values for missing parameters.
Euro NCAP leaves some parameters to be set by the tester. Of course we cannot have an exact temperature or wind flow in real life; we can only have ranges. In a simulation, however, we could always simulate the most extreme cases. The prompt above gave us a list of tables with no narrative. The automotive scenario can be extracted from the tables alone, but from the user’s point of view we need an additional narrative to accompany those tables to better know what we are looking at. So we further processed those tables with the same LLM agent to provide a story first. The following answer, along with the tables of parameters that followed later, was used in the same pipeline as the first video to generate our second video above.
SCENARIO NARRATIVE
Contextual Background
The CCRb-60 scenario simulates a critical real-world highway driving condition where a vehicle under test (VUT) is traveling at a constant 60 km/h on a dry, level highway when the lead vehicle (Global Vehicle Target – GVT) suddenly initiates emergency braking. This scenario is representative of sudden traffic slowdowns, emergency stops, or collision avoidance maneuvers by preceding vehicles, which account for a significant portion of rear-end collisions in urban and highway environments. The test is conducted on a standardized test track with a smooth, friction-optimized asphalt surface (Peak Braking Coefficient ≥0.9) under daylight conditions with homogenous illumination exceeding 2,000 lux. The VUT's Autonomous Emergency Braking (AEB) system must detect the braking target vehicle, predict a collision trajectory, and autonomously apply braking to mitigate impact speed before contact occurs.
Actors & Vehicle Specifications
Two primary actors participate in this scenario: (1) the Vehicle Under Test (VUT) – a passenger car equipped with an AEB collision mitigation system, initially traveling at 60 km/h in a stable, straight-line path along the centerline of a 3.5 m wide driving lane; and (2) the Global Vehicle Target (GVT) – a standardized reference vehicle representing a typical mid-size sedan, initially moving at an identical speed of 60 km/h directly ahead of the VUT. The initial longitudinal separation between the two vehicles is maintained at a one-second time headway (approximately 16.7 m at 60 km/h), simulating typical highway following distance. Both vehicles are aligned on the lane centerline with zero lateral offset. At the test commencement (T0), defined as 4 seconds before predicted impact under no-intervention conditions, the GVT begins a programmed deceleration profile at a constant rate of -4 m/s² (approximately 0.41 g), representing a firm but not maximum braking event. This deceleration rate is precisely controlled and reached within 1.0 second, then maintained within ±0.5 km/h of the reference speed profile until the GVT velocity drops to 2 km/h or lower.
Critical Behavioral Constraints & Test Execution
The test is valid only when strict boundary conditions are maintained throughout the interval from T0 to the AEB activation time (TAEB). The VUT speed must remain within +1.0 km/h of the nominal 60 km/h (no tolerance for lower speeds, allowing up to 61 km/h), the GVT speed must stay within ±1.0 km/h of its prescribed deceleration profile, lateral deviation must not exceed ±0.05 m from the centerline for both vehicles, and relative heading angles must remain within ±1.0°/s. The test driver operates the VUT with no active steering or throttle inputs; alternatively, robotic control systems can modulate acceleration and steering to maintain initial conditions until the AEB system naturally intervenes. If the VUT's AEB system successfully activates (indicated by an acceleration signal crossing the -1 m/s² threshold, confirmed by the last data point below -3 m/s²), continuous longitudinal deceleration is recorded until either (a) the VUT's speed equals the GVT's speed (closing velocity eliminated), (b) contact occurs between the vehicles, or (c) both vehicles come to rest. The critical success metric for this Standard Range test case is the relative impact speed (Vrel_impact) – the difference between the VUT's velocity and the GVT's velocity at the moment of collision. A Vrel_impact of 0 km/h (full avoidance) scores a Green (1.00 point), while Vrel_impact >0 to ≤10 km/h scores Yellow (0.75), >10 to ≤20 km/h scores Orange (0.50), and so forth.
IMPLEMENTATION PARAMETER TABLES
Table 1:
...
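As a quick sanity check on the figures in the narrative above (60 km/h, one-second headway, a -4 m/s² GVT deceleration), a few lines of Python confirm the basic geometry of the scenario. This is an illustrative calculation only, not part of the Euro NCAP protocol or of the generated experiment.

```python
# Illustrative sanity check of the CCRb-60 figures above; not part of the
# Euro NCAP protocol or of the generated Prescan experiment.
V0_KMH = 60.0                    # initial speed of VUT and GVT [km/h]
V0 = V0_KMH / 3.6                # ... in m/s
HEADWAY_S = 1.0                  # one-second time headway
A_GVT = 4.0                      # magnitude of GVT deceleration [m/s^2]

gap = V0 * HEADWAY_S             # initial longitudinal separation
t_stop = V0 / A_GVT              # time for the GVT to brake from 60 km/h to rest
d_stop = V0**2 / (2 * A_GVT)     # distance the GVT covers while braking

print(f"initial gap at 1 s headway : {gap:.1f} m")           # ~16.7 m, as in the narrative
print(f"GVT time to standstill     : {t_stop:.1f} s")         # ~4.2 s at -4 m/s^2
print(f"GVT braking distance       : {d_stop:.1f} m")         # ~34.7 m
print(f"-4 m/s^2 expressed in g    : {A_GVT / 9.81:.2f} g")   # ~0.41 g, as in the narrative
```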
Missing Feedback
In simulation environments, the AI agent perceives the scene only through the code provided. The process of generating an experiment from this code is essentially a black box to the agent. While reusable examples serve as helpful building blocks, they do not guarantee a valid or realistic final result.
For instance, placing cars inside buildings is clearly an error that a GUI would prevent. However, when building shapes and orientations are unknown, the client is left to resolve such issues manually.
There are two approaches to solving this:
- Meta-functions using the Prescan DMAPI: One solution is to create meta-functions that use the Prescan DMAPI to place objects (e.g., buildings, pedestrians, vehicles). These functions can return information about object placement and detect overlaps. They could also include snapping mechanisms or logic to place objects in empty areas. This approach aligns well with procedural generation and gives flexibility to the user (a minimal sketch follows after this list).
- Providing scene graph information to the AI agent: A more robust solution is to give the AI agent direct access to the scene graph. This eliminates the need to translate graph data into text, reducing noise and ambiguity. It also opens the door to multimodal agents capable of processing both textual and graphical inputs. This is a promising direction and one that AI engineers should actively pursue.
— A procedural generation library using the Prescan DMAPI can be found in this GitHub repo
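Below is a minimal sketch of what such a meta-function could look like. It deliberately avoids real Prescan DMAPI calls: object creation is delegated to a hypothetical create_object callback supplied by the caller, and only the overlap check and the snap-to-free-space logic are shown.

```python
# Minimal sketch of a placement meta-function.  The Prescan-specific call is
# hidden behind a caller-supplied `create_object` function (hypothetical);
# only the overlap detection and snapping logic are illustrated here.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Footprint:
    """Axis-aligned 2D bounding box of an object already placed in the scene."""
    x: float       # centre x [m]
    y: float       # centre y [m]
    width: float   # extent along x [m]
    length: float  # extent along y [m]

    def overlaps(self, other: "Footprint") -> bool:
        return (abs(self.x - other.x) * 2 < self.width + other.width and
                abs(self.y - other.y) * 2 < self.length + other.length)


def place_with_snapping(
    new: Footprint,
    placed: List[Footprint],
    create_object: Callable[[float, float], None],
    step: float = 1.0,
    max_shift: float = 20.0,
) -> Optional[Footprint]:
    """Place `new`, shifting it along +y in `step` increments until it no longer
    overlaps anything already placed.  Returns the final footprint, or None."""
    shift = 0.0
    while shift <= max_shift:
        candidate = Footprint(new.x, new.y + shift, new.width, new.length)
        if not any(candidate.overlaps(p) for p in placed):
            create_object(candidate.x, candidate.y)  # actual DMAPI call goes here
            placed.append(candidate)
            return candidate
        shift += step
    return None  # no free spot found; report back to the agent instead of guessing
```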
Prompt Layering
To improve prompt accuracy for our AI agent developer, we introduced a second agent with the role of a software architect. This architect reviews the prompt, identifies ambiguities, and suggests reasonable defaults for unclear parameters.
This creates a two-agent architecture where the developer agent receives a refined, noise-free prompt. As shown below, the architect can even generate a new prompt that guides the developer on how to write the code.
The concept of layered agents is inspired by the BMAD method, which uses multiple AI agents as stakeholders in an agile framework. While more agents could be added, two are sufficient for this demonstration.
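The sketch below illustrates the layering idea with a generic chat-completion helper. The call_llm function, the system prompts, and the message flow are placeholders for whichever client or model hosts the agents, not the exact Copilot setup used in the demo.

```python
# Illustrative two-agent layering.  `call_llm` stands in for whatever model or
# client hosts the agents; the system prompts are simplified placeholders.
from typing import Callable

ARCHITECT_SYSTEM = (
    "You are a software architect for driving simulations. Rewrite the user's "
    "scenario request as an unambiguous task description: list every object, "
    "position, speed and tolerance, and propose justified defaults for anything "
    "the user left unspecified."
)
DEVELOPER_SYSTEM = (
    "You are a developer. Turn the task description into a Python script for the "
    "Simcenter Prescan DMAPI, then run it via the MCP server and iterate until it "
    "executes successfully."
)


def layered_prompt(user_request: str, call_llm: Callable[[str, str], str]) -> str:
    """The architect refines the request; the developer receives the refined task."""
    refined_task = call_llm(ARCHITECT_SYSTEM, user_request)
    return call_llm(DEVELOPER_SYSTEM, refined_task)
```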
Since the architecture is already set, we could host the agents in the same client. To do that, we would need to make use of the Agent2Agent (A2A) protocol. Of course, for the sake of the demo, the simple single-agent setup was enough.
Conclusion
AI-assisted simulation generation is still an evolving field, with no one-size-fits-all solution. Each approach — whether rule-based, prompt-driven, or graph-enhanced — offers unique advantages. The real potential lies in combining these strengths into a hybrid system that balances flexibility, precision, and scalability. In this demo, we explored a foundational method: using an AI agent to generate scenario scripts from natural language. This is just the beginning. By integrating scene graph feedback, procedural geometry generation, layered prompting, agent swarms and dynamic behavior scripting, we can move toward a more complete and resilient workflow.
Ultimately, the goal is clear: to empower engineers with tools that are transparent, reproducible, and future-proof — where the user’s intent drives the simulation, not the limitations of the assistant.
Thank you for reading this far. I am looking forward to your feedback, and do not hesitate to reach out, as this sector still needs our amazing ideas. Let’s make autonomous vehicles safe!
Acknowledgement
A heartfelt thanks to the Simcenter Prescan team for providing the powerful APIs that made this work possible. Special appreciation goes to Ahmet Bilgili for his ideas and vision, and for the close guidance and use cases that shaped this agentic workflow. I’d also like to thank IBM’s Docling team for the support in integrating Docling effectively, enabling NCAP scenarios to flow seamlessly through our pipeline. And to the people who have seen me develop it and listened to me when I forgot to stop talking about how important this is: thank you!