Automating Simulations
Setting up a simulation in materials science and engineering requires not only an understanding of the simulation technique employed but also proficient knowledge of the file format used to configure the simulation run. Large Language Models (LLMs) can support this.
As part of the FULL-MAP project, co-researchers that support the simulation setup will be developed. The simulation tools considered are DAMASK and LAMMPS.
Questions
- aside from fine-tuning a code LLM, what other techniques could be used?
  - RAG: less expensive.
- if fine-tuning, how do we generate the dataset?
  - we have the intended outputs (input scripts) from the simulation software examples
  - we do not have natural-language descriptions of the scripts, for example: “write a LAMMPS script to compute the melting point of copper”; the mapping from prompts (problem descriptions) to outputs (input scripts) is missing
  - we also do not have any dataset available for DAMASK, so we have to create one
- benchmarking?
  - performance of vanilla LLM vs. LLM + RAG vs. LLM fine-tuned on a custom dataset
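The missing prompt-to-script mapping could be stored as one record per example. The following is a minimal sketch only: the class name `PromptScriptPair`, its field names, and the JSONL layout are assumptions made for illustration, not an agreed project format, and the script body is a placeholder rather than a real input file.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class PromptScriptPair:
    """One training example (hypothetical schema)."""
    tool: str    # "LAMMPS" or "DAMASK"
    prompt: str  # natural-language problem description
    script: str  # content of the simulation input file
    source: str  # where the script and the prompt came from


def to_jsonl(pairs: List[PromptScriptPair]) -> str:
    """Serialize the pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


def from_jsonl(text: str) -> List[PromptScriptPair]:
    """Parse JSON Lines back into records, skipping blank lines."""
    return [PromptScriptPair(**json.loads(line))
            for line in text.splitlines() if line.strip()]


pairs = [
    PromptScriptPair(
        tool="LAMMPS",
        prompt="Write a LAMMPS script to compute the melting point of copper.",
        script="# placeholder: content of an existing example input script",
        source="software example, prompt written by hand",
    )
]

restored = from_jsonl(to_jsonl(pairs))
```

Keeping a `source` field would make it possible to distinguish hand-written prompts from generated ones when benchmarking later.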
Technical Aspects
MCP (Model Context Protocol)
Ollama
Training strategies
- Create the input and run the simulation to validate it: expensive
- For LAMMPS: use an existing input checker
- For DAMASK: YAML Lint (Python tool)
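One way to combine these options is to run the cheap checks first and start a simulation only for candidates that pass them. The sketch below assumes each checker is wrapped as a Python callable that takes the input text and returns `True` or `False`. The function `score_candidate`, the stub check `non_empty`, and the score values 0.0 / 0.5 / 1.0 are hypothetical choices for illustration; a real setup would plug in the LAMMPS checker or a YAML linter instead of the stub.

```python
from typing import Callable, Optional, Sequence

# A check takes the input file content and reports whether it passed.
Check = Callable[[str], bool]


def score_candidate(
    script: str,
    cheap_checks: Sequence[Check],
    run_simulation: Optional[Check] = None,
) -> float:
    """Score a generated input file (score values are arbitrary).

    0.0: a cheap check failed, the simulation is never started
    0.5: cheap checks passed, simulation skipped or failed
    1.0: cheap checks passed and the simulation succeeded
    """
    for check in cheap_checks:
        if not check(script):
            return 0.0
    if run_simulation is None:
        return 0.5
    return 1.0 if run_simulation(script) else 0.5


def non_empty(script: str) -> bool:
    """Stub check standing in for a real linter or input checker."""
    return bool(script.strip())
```

For example, `score_candidate(text, [non_empty])` scores a candidate without running anything expensive, while passing `run_simulation` enables the full check for candidates that survive the cheap ones.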
Further links
- BentoML
- FINALES2
- https://github.com/huggingface/llm-ls
- https://github.com/rosarp/llm-lsp
- https://opencode.ai
- https://crates.io/crates/llm-lsp
- https://github.com/chenghao-wu/mcp_lammps
- https://github.com/SilasMarvin/lsp-ai
- AlloyBERT
- AlloyGPT
- https://deepwiki.com/damask-multiphysics/DAMASK
- https://agentskills.so
- https://blog.fsck.com