DEV Community

Unlock the Power of LLM-Driven ETL: Transform Variable CSV to Clean JSON with C#, Semantic Kernel & Llama 3.2-3B

This project demonstrates an LLM-powered ETL process in a .NET 8 console application. Traditional ETL pipelines tend to break when CSV column names change between files; this approach instead uses an LLM to infer the column mapping at runtime. The application samples the first few rows of a CSV and asks a lightweight Llama 3.2-3B model which CSV columns correspond to a fixed Customer schema with Id, Name, Email, SignupDate, and IsActive fields.

The inferred mapping is then applied to the entire file: the application streams the CSV, converts each row into a Customer record using the LLM-generated mapping, and writes line-delimited JSON. The `Convert` and `Lookup` methods handle data normalization and type conversion, including flexible date parsing. Because the mapping is inferred rather than maintained by hand, the approach avoids fragile configuration files and copes with messy real-world data. The whole solution is approximately 150 lines of C#, built on Semantic Kernel and Ollama, and shows how an LLM can support dynamic data integration where fixed-mapping ETL falls short.
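The mapping-then-transform step described above can be sketched roughly as follows. This is a minimal illustration, not the article's code: the names `Lookup` and `Convert` come from the summary, but their signatures, the `RowMapper` class, the `ParseDate` helper, the list of date formats, and the CSV header names are all assumptions. The mapping is hard-coded here to stand in for what the model would return, so no Semantic Kernel or Ollama calls are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Stand-in for the LLM-inferred mapping: schema field -> CSV column (hypothetical headers).
var mapping = new Dictionary<string, string>
{
    ["Id"] = "cust_no", ["Name"] = "full_name", ["Email"] = "mail",
    ["SignupDate"] = "joined", ["IsActive"] = "enabled",
};

// One parsed CSV row, keyed by header name (hypothetical sample data).
var row = new Dictionary<string, string>
{
    ["cust_no"] = "7", ["full_name"] = " Ada Lovelace ", ["mail"] = "ada@example.com",
    ["joined"] = "03/01/2024", ["enabled"] = "Yes",
};

// JsonSerializer emits compact single-line JSON by default, which suits line-delimited output.
Console.WriteLine(JsonSerializer.Serialize(RowMapper.Convert(mapping, row)));

// Fixed target schema named in the summary.
public record Customer(int Id, string Name, string Email, DateTime SignupDate, bool IsActive);

public static class RowMapper
{
    // Assumed set of accepted date formats; the article's actual list is not given.
    private static readonly string[] DateFormats =
        { "yyyy-MM-dd", "MM/dd/yyyy", "dd-MMM-yyyy", "yyyyMMdd" };

    // Returns the trimmed cell for a schema field, or "" if the field or column is missing.
    public static string Lookup(IReadOnlyDictionary<string, string> mapping,
                                IReadOnlyDictionary<string, string> row, string field)
    {
        if (mapping.TryGetValue(field, out var column) && row.TryGetValue(column, out var value))
            return value.Trim();
        return "";
    }

    // Tries the explicit formats first, then a general parse; falls back to DateTime.MinValue.
    public static DateTime ParseDate(string raw)
    {
        if (DateTime.TryParseExact(raw, DateFormats, CultureInfo.InvariantCulture,
                                   DateTimeStyles.None, out var exact))
            return exact;
        if (DateTime.TryParse(raw, CultureInfo.InvariantCulture, DateTimeStyles.None, out var loose))
            return loose;
        return DateTime.MinValue;
    }

    // Builds a Customer from one row, normalizing types along the way.
    public static Customer Convert(IReadOnlyDictionary<string, string> mapping,
                                   IReadOnlyDictionary<string, string> row)
    {
        // An unparseable Id leaves id at 0 rather than throwing.
        int.TryParse(Lookup(mapping, row, "Id"), NumberStyles.Integer,
                     CultureInfo.InvariantCulture, out var id);
        var isActive = Lookup(mapping, row, "IsActive").ToLowerInvariant()
            is "true" or "1" or "yes" or "y";
        return new Customer(id,
                            Lookup(mapping, row, "Name"),
                            Lookup(mapping, row, "Email"),
                            ParseDate(Lookup(mapping, row, "SignupDate")),
                            isActive);
    }
}
```

In a full pipeline, `Convert` would be called once per streamed row and each serialized record written on its own line. Falling back to default values (0, empty string, `DateTime.MinValue`) instead of throwing is one possible design choice for messy input; a stricter pipeline might reject or log such rows instead.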
dev.to