A web-based system for bulk uploading, deduplicating, and tracking institutional awards from CSV data. It replaced a messy manual process with a clean, auditable workflow.
At the institution, awards data was scattered across dozens of CSV files delivered at irregular intervals. Each file came from a different source, with different column orders, different date formats, and sometimes different column names entirely. Administrators had to open each file, visually scan for structure, and manually copy records into a central spreadsheet. One misaligned column could mean hundreds of awards assigned to the wrong people.
Duplicates were the bigger headache. The same award would appear in multiple files, sometimes with slightly different formatting, sometimes identical. The team tracked "already processed" records in a separate spreadsheet, but it was never fully up to date. When auditors asked for a clean list of all awards issued, nobody could produce one with confidence.
The volume made manual processing unsustainable. What started as a manageable trickle of award files had grown into regular bulk deliveries containing thousands of records. The team needed a system that could ingest messy data and detect problems on its own, then produce audit-ready output without requiring anyone to become a spreadsheet expert.
Every data source delivered files with different column orders, naming conventions, and date formats. Automated processing was impossible without manual cleanup first.
The same awards appeared in multiple files. The only record of what had already been processed was a spreadsheet that was perpetually out of date.
Copy-pasting thousands of records by hand led to misaligned columns, dropped rows, and data that couldn't be trusted when auditors came calling.
There was no centralized database or processing log. When asked "how many awards were issued last quarter?", the answer required hours of manual reconciliation.
Built a file upload pipeline that accepts ZIP archives containing multiple CSV files. The system extracts every CSV, validates file integrity, and queues each one for processing. Administrators can upload an entire delivery in one action instead of handling files individually.
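As an illustration only, the extract-and-validate step might look like the sketch below. The function name `extract_csv_files`, the UTF-8 assumption, and the rejection reasons are hypothetical and not taken from the production system; queuing is left out.

```python
import csv
import io
import zipfile


def extract_csv_files(zip_bytes: bytes) -> tuple[dict[str, list[list[str]]], dict[str, str]]:
    """Return (parsed, rejected): rows per CSV member, and a reason per rejected member."""
    parsed: dict[str, list[list[str]]] = {}
    rejected: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # testzip() reads every member and returns the first name that fails its CRC check.
        corrupt = archive.testzip()
        if corrupt is not None:
            raise zipfile.BadZipFile(f"corrupt member: {corrupt}")
        for info in archive.infolist():
            # Skip directories and anything that is not a CSV file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            try:
                # Assumption: files are UTF-8, possibly with a byte-order mark.
                text = archive.read(info).decode("utf-8-sig")
            except UnicodeDecodeError:
                rejected[info.filename] = "not valid UTF-8"
                continue
            rows = list(csv.reader(io.StringIO(text)))
            if not rows:
                rejected[info.filename] = "empty file"
                continue
            parsed[info.filename] = rows
    return parsed, rejected
```

Returning rejected files alongside parsed ones, rather than skipping them quietly, keeps every file in the delivery accounted for.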
Implemented automatic column mapping that identifies AwardId, UserName, and IssueDate regardless of column order or naming variations. The system reads header rows, matches against known patterns, and flags files where it can't confidently identify required fields. No more manual column alignment.
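A minimal sketch of header-based column mapping, assuming a small alias table. The aliases in `KNOWN_HEADERS` and the function names are illustrative guesses, not the system's actual patterns.

```python
import re

# Hypothetical alias table; the real system's known patterns are not listed here.
KNOWN_HEADERS = {
    "AwardId": {"awardid", "awardnumber", "id"},
    "UserName": {"username", "user", "recipient"},
    "IssueDate": {"issuedate", "dateissued", "date"},
}


def normalize(header: str) -> str:
    """Lowercase the header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", header.lower())


def map_columns(header_row: list[str]) -> tuple[dict[str, int], list[str]]:
    """Return (field -> column index, required fields that could not be identified)."""
    mapping: dict[str, int] = {}
    for index, raw in enumerate(header_row):
        key = normalize(raw)
        for field, aliases in KNOWN_HEADERS.items():
            # The first column that matches a field wins; later matches are ignored.
            if key in aliases and field not in mapping:
                mapping[field] = index
    missing = [field for field in KNOWN_HEADERS if field not in mapping]
    return mapping, missing
```

A non-empty `missing` list is the signal to flag the file for manual review instead of guessing at the column layout.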
Every incoming record is checked against the existing database before insertion. Duplicates are flagged but never silently dropped: administrators see exactly which records were skipped and why, with enough context to verify each decision.
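The flag-don't-drop behavior can be sketched as follows. This assumes an in-memory set of keys standing in for the database lookup, and a comparison key that ignores case and stray whitespace; `record_key`, `DedupResult`, and `check_batch` are hypothetical names.

```python
from dataclasses import dataclass, field


def record_key(award_id: str, user_name: str, issue_date: str) -> tuple:
    """Build a comparison key that ignores case and extra whitespace."""
    return tuple(" ".join(part.split()).casefold() for part in (award_id, user_name, issue_date))


@dataclass
class DedupResult:
    inserted: list = field(default_factory=list)
    flagged: list = field(default_factory=list)


def check_batch(records: list, existing_keys: set) -> DedupResult:
    """Split a batch into records to insert and duplicates to show to an administrator."""
    result = DedupResult()
    for line_no, record in enumerate(records, start=1):
        key = record_key(*record)
        if key in existing_keys:
            # Keep the original record and the matched key so the decision can be verified.
            result.flagged.append(
                {"line": line_no, "record": record, "reason": "matches existing record", "matched_key": key}
            )
            continue
        # Remember the key so repeats later in the same batch are flagged too.
        existing_keys.add(key)
        result.inserted.append(record)
    return result
```

Each flagged entry carries the line number, the untouched input, and the key it collided with, which is the context an administrator needs to confirm or override the decision.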
After every batch upload, the system generates a detailed summary report: total records processed, new records inserted, duplicates flagged, errors encountered. Session management with cookie-based "Remember Me" authentication keeps administrators logged in across sessions without re-entering credentials.
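For illustration, the summary counts could be derived from a per-record outcome list like this. The outcome labels and the `summarize` function are assumptions made for the sketch.

```python
from collections import Counter

# Assumed outcome labels; each processed record gets exactly one.
OUTCOMES = ("inserted", "duplicate", "error")


def summarize(outcomes: list[str]) -> dict[str, int]:
    """Count outcomes so that the per-category numbers always add up to the total."""
    unknown = set(outcomes) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcomes: {sorted(unknown)}")
    counts = Counter(outcomes)
    return {"total": len(outcomes), **{name: counts[name] for name in OUTCOMES}}
```

Rejecting unknown labels means a record can never be counted in the total without also appearing in one of the categories.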
Let's talk about how automated processing could clean up your data workflow.
Start a Conversation
Upload entire ZIP archives containing multiple CSV files. The system extracts, validates, and queues every file automatically. No more processing files one at a time.
Identifies AwardId, UserName, and IssueDate columns regardless of naming or order. Files with unrecognizable structures are flagged for manual review.
Every record is cross-checked against existing data before insertion. Near-duplicates with whitespace or formatting differences are caught and flagged transparently.
After every batch, a detailed report breaks down total records, new inserts, duplicates skipped, and errors encountered. Every decision the system made is visible and verifiable.
Secure login with session management and cookie-based "Remember Me" functionality. Administrators stay authenticated across sessions without repeated logins.
Every row is validated against expected formats before processing. Malformed dates, missing required fields, and structural anomalies are caught and reported, not silently ignored.
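A rough sketch of per-row validation, assuming rows arrive as dictionaries keyed by the three required fields. The accepted date formats in `DATE_FORMATS` are an assumption; ambiguous inputs such as 03/04/2024 are read using whichever listed format matches first.

```python
from datetime import date, datetime
from typing import Optional

REQUIRED_FIELDS = ("AwardId", "UserName", "IssueDate")
# Assumed set of accepted formats, tried in this order.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_date(value: str) -> Optional[date]:
    """Return the parsed date, or None if no accepted format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    return None


def validate_row(row: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the row is valid."""
    errors = [f"missing {name}" for name in REQUIRED_FIELDS if not (row.get(name) or "").strip()]
    issue_date = (row.get("IssueDate") or "").strip()
    if issue_date and parse_date(issue_date) is None:
        errors.append(f"malformed IssueDate: {issue_date!r}")
    return errors
```

Collecting every problem for a row, instead of stopping at the first, lets the report show the full picture for each rejected record.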
The main upload screen where administrators drag and drop ZIP archives. Progress indicators show extraction, validation, and processing status in real time as each CSV file is handled.
Each flagged duplicate is shown next to the existing record it matched. Administrators can verify the decision and override when needed.
The post-processing summary lists records processed, new inserts, duplicates skipped, and validation errors. It's a complete audit trail for every batch upload.
We used to dread award season. Every delivery meant hours of sorting through CSVs, checking for duplicates by hand, and praying we didn't miss anything. Now we upload the ZIP, review the summary, and move on. The system catches things we never would have found manually.
The first instinct was to auto-correct formatting issues and silently skip duplicates. Administrators pushed back hard. They needed to see every decision the system made: which records were inserted, which were flagged, and why. Trust in automated processing comes from transparency, not from hiding the messy parts. The summary reports became the most valued feature.
Parsing a well-formed CSV is straightforward. Figuring out which column is which when every source uses different headers ("award_id" vs "AwardID" vs "Award Number" vs just "ID") required building a fuzzy matching system that could handle real-world naming chaos. Edge cases in column detection consumed more development time than the entire upload pipeline.
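One way to approximate this kind of fuzzy header matching is a similarity score with a cutoff, sketched here with the standard library's `difflib.SequenceMatcher`. The variant lists, the 0.8 cutoff, and the `best_field` function are illustrative assumptions, not the matching system that was actually built.

```python
import difflib
import re

# Hypothetical known spellings per field, already lowercased with punctuation removed.
CANONICAL = {
    "AwardId": ["awardid", "awardnumber", "id"],
    "UserName": ["username", "user", "recipient"],
    "IssueDate": ["issuedate", "dateissued", "date"],
}


def best_field(header: str, cutoff: float = 0.8):
    """Return (field, score) for the closest known spelling, or (None, score) below the cutoff."""
    key = re.sub(r"[^a-z0-9]", "", header.lower())
    best_name, best_score = None, 0.0
    for field, variants in CANONICAL.items():
        for variant in variants:
            score = difflib.SequenceMatcher(None, key, variant).ratio()
            if score > best_score:
                best_name, best_score = field, score
    if best_score < cutoff:
        return None, best_score
    return best_name, best_score
```

With these assumptions, `best_field("award_id")` resolves to `AwardId` with a score of 1.0, a misspelling such as "Award Numbr" still resolves to `AwardId`, and an unrelated header such as "Comments" falls below the cutoff and returns `None`. Very short headers like "ID" are the risky case: they match exactly here, but only because "id" was placed in the table by hand.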
Before the system, administrators processed files as they arrived, one at a time, interrupting other work. Once batch processing was reliable, the team shifted to a scheduled workflow: collect deliveries throughout the week, upload everything Friday afternoon, review the summary Monday morning. The tool didn't just speed up a task; it changed how the team organized their time.
Every project starts with a conversation. Tell us about your data processing challenges and let's figure out what an automated awards workflow could look like for you.
No pitch. No pressure. Just a conversation about what might work.