OER Resource Manager

We built a batch processing pipeline that pulls Open Educational Resources from over 50 repositories and ranks them by quality. Faculty can find free textbook replacements without spending a weekend searching.

50+ Sources Indexed
AI Quality Scoring
0% Cost Reduction
Screenshot coming soon

What was broken.

Faculty across the institution wanted to adopt Open Educational Resources to cut student costs, but the ecosystem was a mess. Dozens of OER repositories existed (OpenStax, OER Commons, MERLOT, MIT OpenCourseWare, and others), each with different interfaces, metadata schemas, and quality levels. Finding a replacement for a $200 textbook meant manually searching each platform one at a time.

Even when faculty found OER materials, there was no good way to judge quality. A resource might look promising from its title but turn out to be a poorly formatted PDF with outdated content. Without any shared evaluation system, every faculty member started from scratch, repeating the same frustrating search their colleagues had already given up on.

The institution needed a way to pull resources from every major OER repository, normalize the metadata into one searchable index, and surface the best materials automatically. Faculty should be teaching, not searching.

Fragmented Repositories

Dozens of OER sources with different interfaces and metadata formats. No single place to search them all.

No Quality Signal

Faculty couldn't distinguish high-quality OER from outdated or poorly structured materials without downloading and reviewing each one.

Textbook Cost Burden

Students were spending hundreds of dollars per semester on required textbooks. Free alternatives existed but were nearly impossible to find at scale.

Manual Search Fatigue

Each faculty member spent hours repeating the same searches across the same repositories. None of that work could be reused.

How we solved it.

01

Multi-Source Harvesting Pipeline

We built a Python batch processing system with modular connectors for 50+ OER repositories. Each connector handles a source's unique API or scraping requirements and pulls resource metadata, download links, and PDFs into a unified staging area.

Connectors for OpenStax, OER Commons, MERLOT, MIT OCW, Open Textbook Library, BC Campus, and dozens more. Each has its own authentication, pagination, and rate-limiting logic.
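The connector layer can be sketched as a small base class that owns the shared pagination and rate-limiting loop, with each repository supplying only its own fetch logic. This is a minimal illustration, not the production code; the class and method names (`OERConnector`, `fetch_page`) and the stubbed OpenStax records are assumptions.

```python
import time
from abc import ABC, abstractmethod
from typing import Iterator


class OERConnector(ABC):
    """Base class each repository connector extends. Subclasses supply
    source-specific fetch logic; the base class provides the shared
    pagination walk and rate limiting."""

    source_name: str = "unknown"
    requests_per_minute: int = 30

    def harvest(self) -> Iterator[dict]:
        """Walk the source page by page, throttling between requests."""
        delay = 60.0 / self.requests_per_minute
        page = 1
        while True:
            records = self.fetch_page(page)
            if not records:
                break
            yield from records
            page += 1
            time.sleep(delay)

    @abstractmethod
    def fetch_page(self, page: int) -> list[dict]:
        """Return one page of raw metadata records, or [] when exhausted."""


class OpenStaxConnector(OERConnector):
    source_name = "openstax"

    def fetch_page(self, page: int) -> list[dict]:
        # A real connector would call the OpenStax API here;
        # stubbed with two records for illustration.
        if page > 1:
            return []
        return [{"title": "College Physics", "license": "CC BY"},
                {"title": "Biology 2e", "license": "CC BY"}]
```

New sources then cost one subclass each, which is what keeps a 50-repository fleet maintainable.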

02

Schema Mapping & Metadata Normalization

Every source uses different metadata formats. Some follow Dublin Core, others use custom schemas, and many have inconsistent or missing fields. We built a schema mapping layer that normalizes everything into a consistent structure: subject, format, license, author, publication date, and content type.
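A schema mapping layer like this can be reduced to per-source field maps plus one normalizing function. The source keys and field names below are illustrative assumptions, not the actual production mappings; the unified field list comes from the description above.

```python
# Per-source field maps: unified field name -> that source's field name.
# These mappings are illustrative, not the production tables.
FIELD_MAPS = {
    "oer_commons": {"subject": "dc:subject", "author": "dc:creator",
                    "license": "dc:rights", "published": "dc:date"},
    "merlot": {"subject": "category", "author": "authorName",
               "license": "creativeCommons", "published": "creationDate"},
}

UNIFIED_FIELDS = ("title", "subject", "format", "license",
                  "author", "published", "content_type")


def normalize(source: str, raw: dict) -> dict:
    """Map one raw record into the unified schema, leaving missing
    fields as None so downstream search can treat every source alike."""
    fmap = FIELD_MAPS.get(source, {})
    record = {field: raw.get(fmap.get(field, field))
              for field in UNIFIED_FIELDS}
    record["_source"] = source
    return record
```

Records that arrive with Dublin Core names, custom API names, or missing fields all leave this function with the same shape, which is what makes cross-repository search possible.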

03

AI-Powered Quality Scoring

We integrated the OpenAI API to analyze downloaded PDFs and resource descriptions. The system scores each resource on content depth, organization, recency, citation density, and pedagogical value, so faculty can trust the top-ranked results without reviewing every PDF themselves.
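One way to turn the model's judgment into a single rankable number is to have it return per-dimension scores as JSON and combine them with fixed weights. The dimensions come from the description above; the response format and the weights below are assumptions for illustration, not the production values.

```python
import json

# Illustrative weights, not the production values.
WEIGHTS = {
    "content_depth": 0.30,
    "organization": 0.20,
    "recency": 0.15,
    "citation_density": 0.15,
    "pedagogical_value": 0.20,
}


def composite_score(model_response: str) -> float:
    """Combine the per-dimension 0-10 scores the model returns
    (as a JSON object) into one weighted 0-100 quality score.
    Missing dimensions score 0 rather than raising."""
    scores = json.loads(model_response)
    weighted = sum(WEIGHTS[dim] * float(scores.get(dim, 0))
                   for dim in WEIGHTS)
    return round(weighted * 10, 1)  # 0-10 weighted average -> 0-100
```

Keeping the combination step outside the model call makes the ranking deterministic and lets the weights be tuned without re-scoring every resource.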

04

PHP Web Interface & CLI Tools

We built a PHP web interface backed by MySQL where faculty can browse, search, and filter indexed resources by subject, format, quality score, and license type. A companion CLI tool handles batch harvesting, re-indexing, and re-scoring on a schedule.
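The faceted filtering behind that interface amounts to building a parameterized query from whichever filters the user set. The production code is PHP/MySQL; this sketch uses Python with an in-memory SQLite table, and the table and column names are illustrative assumptions.

```python
import sqlite3


def build_search(filters: dict) -> tuple[str, list]:
    """Build a parameterized faceted-search query from optional filters,
    ordered by quality score. Column names are illustrative."""
    sql = "SELECT title, source, quality_score FROM resources WHERE 1=1"
    params: list = []
    for col in ("subject", "format", "license"):
        if col in filters:
            sql += f" AND {col} = ?"   # facet columns are a fixed allowlist
            params.append(filters[col])
    if "min_score" in filters:
        sql += " AND quality_score >= ?"
        params.append(filters["min_score"])
    sql += " ORDER BY quality_score DESC"
    return sql, params
```

Only values travel as bound parameters; the column names come from a fixed allowlist, so user input never reaches the SQL text itself.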

Technologies Used

Python · OpenAI API · PHP · MySQL · PDF Processing · Schema Mapping · Batch Processing · Web Interface

Struggling with textbook affordability?

I can walk you through how an OER pipeline could help your faculty find free course materials without the manual search.

Start a Conversation

What it actually does.

Multi-Source Harvesting

Automated connectors pull resources from 50+ OER repositories (OpenStax, MERLOT, OER Commons, MIT OCW, and others) into a single searchable index.

PDF-First Content Ranking

Downloads and analyzes the actual PDF content, not just metadata, to rank resources by depth, structure, and pedagogical value.
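Alongside the model's judgment, cheap structural signals can be pulled straight from the extracted PDF text. The heuristics below (a citation pattern, a chapter/section heading count) are illustrative assumptions, not the production feature set.

```python
import re


def content_signals(pdf_text: str) -> dict:
    """Cheap structural signals from extracted PDF text, used alongside
    the model's scores. The regexes are illustrative heuristics."""
    words = pdf_text.split()
    # Count "[12]"-style and "(Author, 2020)"-style citations.
    citations = len(re.findall(r"\[\d+\]|\(\w+,\s*\d{4}\)", pdf_text))
    # Count lines that open a chapter or numbered section.
    headings = len(re.findall(r"(?m)^(?:Chapter|Section)\s+\d+", pdf_text))
    return {
        "word_count": len(words),
        "citations_per_1k": round(1000 * citations / max(len(words), 1), 2),
        "heading_count": headings,
    }
```

A promising-sounding resource that turns out to be a thin, citation-free PDF gets caught by exactly these kinds of signals.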

AI Quality Scoring

OpenAI evaluates each resource on content depth, organization, recency, and pedagogical design. The result is a composite quality score faculty can actually trust.

Schema Mapping

Normalizes metadata from Dublin Core, custom APIs, and inconsistent fields into a unified schema so cross-repository search and comparison actually works.

Search & Filter Interface

PHP web interface lets faculty browse by subject, format, license, and quality score. Instant search across thousands of indexed resources with faceted filtering.

Textbook Replacements

Recommends OER alternatives matched to specific textbooks and courses, with quality comparisons and adoption guidance for faculty.

See it in action.

The numbers speak.

50+
Sources Indexed
From OpenStax to MIT OCW and beyond
0%
Cost Reduction
Average student textbook spending drop
1,000s
Resources Cataloged
Searchable, scored, and ready for adoption
AI
Quality Scoring
OpenAI-powered evaluation of every resource
Before this tool, I spent an entire weekend searching five different OER sites for one psychology textbook replacement. Now I type in my subject, sort by quality score, and have three solid options in under a minute. My students saved over $150 each this semester.
JR, Faculty Member, Department of Behavioral Sciences

What I learned.

01

Metadata is the bottleneck, not content

There's no shortage of free educational material online. The real problem is that every repository describes its resources differently. Building the schema mapping layer took longer than building the harvesting connectors, but it's what made the whole system usable. Without normalized metadata, search across sources is meaningless.

02

PDF analysis changes faculty trust

Early versions only scored resources based on their metadata descriptions. Faculty didn't trust it because descriptions are often vague or overly optimistic. When I added PDF-first analysis that actually reads the content, quality scores became meaningful and adoption followed. Faculty need to know the system looked at the same thing they would.

03

Batch processing needs visibility

The initial CLI-only harvesting tool ran silently for hours. Faculty and administrators had no idea whether it was working, stuck, or finished. Adding a progress dashboard with per-source status and error counts turned a black box into something the team could monitor and trust to run on schedule.
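The per-source status tracking that dashboard reads can be sketched as a small state object the harvesting loop updates as it goes. The class and field names below are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass, field


@dataclass
class HarvestProgress:
    """Per-source state the dashboard reads while a batch run is in
    flight: record totals, error counts, and a run status per source."""
    totals: dict = field(default_factory=dict)
    errors: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)

    def start(self, source: str) -> None:
        self.status[source] = "running"
        self.totals.setdefault(source, 0)
        self.errors.setdefault(source, 0)

    def record(self, source: str, ok: bool) -> None:
        if ok:
            self.totals[source] += 1
        else:
            self.errors[source] += 1

    def finish(self, source: str) -> None:
        self.status[source] = "done"

    def summary(self) -> dict:
        """Snapshot of (status, harvested, errors) per source."""
        return {s: (self.status[s], self.totals[s], self.errors[s])
                for s in self.status}
```

Serializing `summary()` to a status table or JSON endpoint is all the dashboard needs to show whether a run is working, stuck, or finished.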

Ready to make textbooks affordable?

Tell us about your OER goals. We'd like to hear how a centralized resource pipeline could save your faculty time and your students money.

No pitch. No pressure. Just a conversation about what might work.