This project demonstrates an LLM-powered ETL process in a .NET 8 console application.
Traditional ETL pipelines tend to break when CSV column names change; this approach instead uses an LLM to infer the column mapping at runtime.
By sampling the first few rows of a CSV, the application queries a lightweight llama3.2-3B model to identify which CSV columns correspond to a fixed Customer schema.
The inferred mapping is then used to transform the entire CSV file.
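The mapping step described above can be sketched roughly as follows. This is a minimal illustration, not the article's code: the class name `MappingSketch`, the prompt wording, and the JSON reply shape (target field to CSV column name) are all assumptions, and the call to the model itself (made through Semantic Kernel and Ollama in the original) is left out so the sketch stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper: builds the prompt and validates the model's reply.
public static class MappingSketch
{
    // Target schema fields named in the text.
    public static readonly string[] TargetFields =
        { "Id", "Name", "Email", "SignupDate", "IsActive" };

    // Build a prompt from the header line and a few sample rows.
    // The wording is an assumption, not the article's prompt.
    public static string BuildPrompt(string headerLine, IEnumerable<string> sampleRows)
    {
        return
            "Map these CSV columns to the target fields " +
            string.Join(", ", TargetFields) + ".\n" +
            "Reply with a single JSON object whose keys are target fields " +
            "and whose values are CSV column names.\n\n" +
            "CSV sample:\n" + headerLine + "\n" + string.Join("\n", sampleRows);
    }

    // Parse the model's reply, keeping only entries that name a known target
    // field and a column that actually exists in the CSV header.
    public static Dictionary<string, string> ParseMapping(string modelReply, string[] headers)
    {
        var result = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(modelReply);
        foreach (var prop in doc.RootElement.EnumerateObject())
        {
            if (prop.Value.ValueKind != JsonValueKind.String) continue;
            var column = prop.Value.GetString() ?? "";
            if (TargetFields.Contains(prop.Name) && headers.Contains(column))
                result[prop.Name] = column;
        }
        return result;
    }
}
```

Filtering the reply against the real header guards against the model inventing a column name; any field it drops simply stays unmapped.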
The process converts dynamic CSV data into a structured Customer record with Id, Name, Email, SignupDate, and IsActive fields.
The `Convert` and `Lookup` methods handle data normalization and type conversions, including flexible date parsing.
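As a rough sketch of what such `Convert` and `Lookup` helpers might look like: the signatures, the field types on `Customer`, the list of date formats, and the set of truthy strings below are assumptions chosen for illustration, since the summary does not give the original definitions.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Assumed field types; the source only names the fields.
public record Customer(int Id, string Name, string Email, DateTime? SignupDate, bool IsActive);

public static class RowConversionSketch
{
    // Explicit formats tried before falling back to general parsing (assumed list).
    static readonly string[] DateFormats = { "yyyy-MM-dd", "MM/dd/yyyy", "dd-MMM-yyyy" };

    // Return the trimmed cell for a target field, or "" when the field is
    // unmapped or the mapped column is absent from the row.
    public static string Lookup(
        IReadOnlyDictionary<string, string> row,
        IReadOnlyDictionary<string, string> mapping,
        string field)
        => mapping.TryGetValue(field, out var column) && row.TryGetValue(column, out var value)
            ? value.Trim()
            : "";

    // Flexible date parsing: known formats first, then a general parse.
    public static DateTime? ParseDate(string text)
    {
        if (DateTime.TryParseExact(text, DateFormats, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out var exact))
            return exact;
        if (DateTime.TryParse(text, CultureInfo.InvariantCulture,
                DateTimeStyles.None, out var loose))
            return loose;
        return null; // unparseable dates become null rather than throwing
    }

    // Treat a few common truthy spellings as true; everything else is false.
    public static bool ParseBool(string text) =>
        text.ToLowerInvariant() is "true" or "1" or "yes" or "y" or "active";

    // Turn one raw row (column name to cell text) into a typed Customer.
    public static Customer Convert(
        IReadOnlyDictionary<string, string> row,
        IReadOnlyDictionary<string, string> mapping)
    {
        // A failed parse leaves id at 0.
        int.TryParse(Lookup(row, mapping, "Id"), NumberStyles.Integer,
            CultureInfo.InvariantCulture, out var id);
        return new Customer(
            id,
            Lookup(row, mapping, "Name"),
            Lookup(row, mapping, "Email"),
            ParseDate(Lookup(row, mapping, "SignupDate")),
            ParseBool(Lookup(row, mapping, "IsActive")));
    }
}
```

Returning defaults instead of throwing keeps one malformed cell from aborting the whole file; a stricter pipeline might log or reject such rows instead.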
The application streams the full CSV file, processes each row using the LLM-generated mapping, and outputs line-delimited JSON.
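The streaming step might look something like the sketch below, which reads one line at a time and writes one JSON object per row. The class and method names are hypothetical, the comma split is deliberately naive (quoted fields containing commas are not handled), and values are emitted as strings; in the real application the typed `Customer` conversion would presumably sit where the dictionary is built.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public static class StreamingSketch
{
    // Reads CSV text line by line and writes line-delimited JSON.
    // mapping: target field name to CSV column name (e.g. "Name" to "full_name").
    public static void WriteJsonLines(
        TextReader csv,
        TextWriter output,
        IReadOnlyDictionary<string, string> mapping)
    {
        var headerLine = csv.ReadLine();
        if (headerLine is null) return; // empty input: nothing to write
        var headers = headerLine.Split(',');

        // Only the current line is held in memory at any time.
        for (var line = csv.ReadLine(); line is not null; line = csv.ReadLine())
        {
            if (line.Length == 0) continue; // skip blank lines
            var cells = line.Split(','); // assumption: no quoted commas
            var record = new Dictionary<string, string>();
            foreach (var (field, column) in mapping)
            {
                var index = Array.IndexOf(headers, column);
                if (index >= 0 && index < cells.Length)
                    record[field] = cells[index].Trim();
            }
            output.WriteLine(JsonSerializer.Serialize(record));
        }
    }
}
```

Passing `File.OpenText(path)` as the reader and `Console.Out` as the writer would stream a file to standard output without loading it whole.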
Inferring the mapping at runtime removes the need for fragile, hand-maintained mapping configuration files and lets the pipeline cope gracefully with messy real-world data.
The entire solution is built with approximately 150 lines of C# code, leveraging Semantic Kernel and Ollama.
The project showcases the power of LLMs for dynamic data integration, offering a robust and efficient alternative to traditional ETL methods.
dev.to
