The Challenge

What was broken.

The science faculty had dozens of active researchers across multiple departments, each with their own specializations, ongoing projects, and openings for student collaborators. But students had almost no way to discover who was doing what. Faculty profiles lived on scattered departmental pages, some outdated, some missing research interests entirely, and there was no centralized way to search or browse by topic. A student interested in computational biology had no idea that three professors across two departments were actively publishing in that space.

The informal process wasn't much better. Students relied on word of mouth, asked classmates who they'd worked with, or cold-emailed professors based on a course they'd taken, hoping for a match. Advisors tried to broker introductions, but they couldn't keep track of every faculty member's current research focus across every department. Well-connected students found mentors easily. Everyone else, especially those new to the program or from underrepresented backgrounds, was left navigating alone.

The faculty knew this was costing them. Research participation numbers were flat despite growing enrollment, professors were getting irrelevant requests from students who didn't understand their work, and promising students were slipping through the cracks. What they needed wasn't another static directory. They needed a system that could understand what a student was interested in and match them with the right mentor, even if the student didn't know that mentor existed.

Hidden Research Activity

Students didn't know which professors did what research. Faculty profiles were incomplete, outdated, or buried across department websites with no unified search.

Word-of-Mouth Gatekeeping

Research interest matching depended on personal connections and hallway conversations. Well-connected students found mentors; others were left on their own.

Scattered Faculty Profiles

Research interests, publications, and lab openings were fragmented across department websites with no consistent format, so cross-department discovery was impossible.

Scale Without Visibility

In large programs, finding a mentor got harder as the faculty grew. Without a real way to search, students had no practical means of finding the researchers whose work matched their interests.

The Approach

How we solved it.

Scraped and Indexed Faculty Profiles

We built a web scraping pipeline to crawl departmental faculty pages and pull out names, titles, research interests, lab affiliations, publication keywords, and profile photos. The scraper handled inconsistent page structures across departments and normalized everything into a unified faculty index in MySQL with full-text search and structured research interest tags.
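The normalization step can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the project's PHP, and the field labels in `FIELD_ALIASES`, the record shape, and the function names are all hypothetical; the real pipeline's schema is not described here.

```python
import re

# Hypothetical mapping from department-specific labels to unified keys.
FIELD_ALIASES = {
    "name": "name",
    "full name": "name",
    "title": "title",
    "position": "title",
    "research interests": "interests",
    "research areas": "interests",
    "areas of expertise": "interests",
    "lab": "lab",
    "laboratory": "lab",
}


def split_interests(raw):
    """Turn a list or a delimited string into lowercase, de-duplicated tags."""
    parts = raw if isinstance(raw, list) else re.split(r"[;,\n]", raw)
    tags = []
    for part in parts:
        tag = re.sub(r"\s+", " ", part).strip().lower()
        if tag and tag not in tags:
            tags.append(tag)
    return tags


def normalize_profile(raw_fields, department):
    """Map one scraped page's labeled fields onto a single record shape."""
    record = {"name": "", "title": "", "lab": "", "interests": [],
              "department": department}
    for label, value in raw_fields.items():
        key = FIELD_ALIASES.get(label.strip().lower())
        if key is None:
            continue  # unknown label: skip it rather than guess
        if key == "interests":
            record[key] = split_interests(value)
        else:
            record[key] = value.strip()
    return record
```

Records in this shape could then be written to a single table, with the tag list feeding both the structured tags and the full-text index.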

Designed the Multi-Algorithm Matching Engine

We developed three matching strategies: exact keyword matching for precise term hits, semantic similarity scoring using NLP to catch related concepts (so "machine learning" matches "neural networks"), and departmental alignment to factor in administrative proximity. Each algorithm produces a weighted score, and the final ranking blends all three to find the most relevant faculty matches.
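As a rough illustration of how three signals might blend into one score, here is a minimal Python sketch (not the project's PHP code). The weights, the `Signals` type, and the scoring helpers are assumptions made for the example; the actual weighting is not stated in this write-up, and the semantic score is taken as an input computed elsewhere.

```python
from dataclasses import dataclass

# Illustrative weights only; the real values are not given in the source.
WEIGHTS = {"exact": 0.5, "semantic": 0.4, "department": 0.1}


@dataclass
class Signals:
    exact: float       # share of student terms found verbatim in faculty tags
    semantic: float    # similarity from an NLP model, computed elsewhere (0..1)
    department: float  # 1.0 if the department matches a stated preference


def exact_score(student_terms, faculty_tags):
    """Fraction of the student's terms that appear verbatim in the tags."""
    wanted = {t.lower() for t in student_terms}
    if not wanted:
        return 0.0
    have = {t.lower() for t in faculty_tags}
    return len(wanted & have) / len(wanted)


def department_score(faculty_department, preferred_departments):
    """Assumption: no stated preference scores 0.0 for everyone (neutral)."""
    return 1.0 if faculty_department in preferred_departments else 0.0


def composite(signals, weights=WEIGHTS):
    """Weighted average of the three signals, normalized to 0..1."""
    total = sum(weights.values())
    blended = sum(w * getattr(signals, name) for name, w in weights.items())
    return blended / total
```

Dividing by the weight total keeps the result in the 0 to 1 range even if the weights are later retuned and no longer sum to one.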

Built the Student Input Interface

We created a guided input flow where students describe their research interests in natural language, select from predefined topic tags, and optionally specify department preferences. The interface works for students who know exactly what they want ("computational genomics") and those exploring broadly ("something with data and biology"). Both feed into the matching engine the same way.
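One way to let precise and exploratory inputs feed the engine "the same way" is to reduce both to a single query shape. The Python sketch below is only an illustration (the project itself lists PHP); the stopword list, the `build_query` name, and the returned dictionary layout are assumptions.

```python
import re

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"a", "an", "and", "in", "of", "or", "something", "the", "with"}


def build_query(free_text, selected_tags=(), departments=()):
    """Merge free text, topic tags, and department choices into one query."""
    terms = []
    for word in re.findall(r"[a-z0-9]+", free_text.lower()):
        if word not in STOPWORDS and word not in terms:
            terms.append(word)
    for tag in selected_tags:
        tag = tag.strip().lower()
        if tag and tag not in terms:
            terms.append(tag)  # multi-word tags are kept whole
    return {
        "text": free_text.strip(),  # kept for semantic scoring
        "terms": terms,             # used for exact keyword matching
        "departments": sorted({d.strip() for d in departments if d.strip()}),
    }
```

With this shape, "something with data and biology" yields the terms `data` and `biology`, while a selected tag such as "Computational Genomics" passes through as a single term.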

Delivered Ranked Results with Faculty Detail Pages

The results page ranks matched faculty by composite score, showing match percentage, which algorithms contributed, and a preview of each professor's research focus. Each result links to a full faculty detail page with photo, bio, publications, current projects, and contact information. Students get everything they need to make an informed outreach decision.
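The ranking and "which algorithms contributed" display could be derived along these lines. This is a hedged Python sketch, not the project's PHP code: the candidate dictionary layout, the `rank_matches` name, and the `min_contribution` cutoff are all assumptions for illustration.

```python
def rank_matches(candidates, min_contribution=0.1):
    """Sort candidates by composite score and build display-ready rows.

    Each candidate is assumed to look like:
    {"name": str, "composite": float in 0..1, "signals": {algorithm: score}}
    """
    ordered = sorted(candidates, key=lambda c: c["composite"], reverse=True)
    rows = []
    for c in ordered:
        contributing = [name for name, score in c["signals"].items()
                        if score >= min_contribution]
        rows.append({
            "name": c["name"],
            "match_percent": round(c["composite"] * 100),
            "contributing": contributing,
        })
    return rows
```

Sorting on the unrounded composite keeps the order stable when two professors round to the same percentage.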

Technologies Used

PHP 8.0+, MySQL 8.0+, Bootstrap 5, Web Scraping, Semantic Similarity NLP, Faculty Photo Integration

Key Features

What it actually does.

Multi-Algorithm Matching

Three matching strategies work in parallel: exact keyword hits, semantic similarity for related concepts, and departmental alignment. These blend into a single composite score that ranks the best-fit faculty mentors.

Faculty Profile Scraping

An automated scraping pipeline crawls departmental pages to extract faculty names, titles, research interests, lab affiliations, and profile photos. It normalizes inconsistent formats into a unified, searchable index.

Research Interest Indexing

Faculty research areas are parsed, tagged, and indexed with full-text search and semantic embeddings. A student searching "machine learning" will also find professors working in "deep learning," "neural networks," and "AI."
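A common way to find related terms from embeddings is cosine similarity. The Python sketch below assumes that approach; the source does not say which similarity measure or embedding model the project uses. The three-dimensional vectors are made-up toy values, and the threshold is arbitrary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy embeddings for illustration; real ones would come from an NLP model.
TOY_EMBEDDINGS = {
    "machine learning": [0.9, 0.1, 0.0],
    "deep learning": [0.8, 0.2, 0.1],
    "organic chemistry": [0.0, 0.1, 0.9],
}


def related_terms(query, embeddings, threshold=0.8):
    """Return other terms whose vectors point in a similar direction."""
    query_vec = embeddings[query]
    return sorted(
        term for term, vec in embeddings.items()
        if term != query and cosine_similarity(query_vec, vec) >= threshold
    )
```

With these toy vectors, "machine learning" pulls in "deep learning" but not "organic chemistry", which is the behavior the paragraph above describes.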

Student Preference Input

A guided input flow lets students describe interests in natural language, select from topic tags, and set department preferences. It works whether a student knows exactly what they want or is still exploring.

Ranked Results with Match Scores

Results are ranked by composite match percentage, showing which algorithms contributed to the score. Students see at a glance why each professor was recommended and how strong the alignment is.

Department-Level Filtering

Students can filter matches by department to narrow results, or leave it open to discover cross-departmental researchers they'd never have found through a single department's website.
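The optional nature of the filter is the important part: an empty selection should mean "search everywhere", not "match nothing". A minimal Python sketch, with a hypothetical `filter_by_department` helper and match-row layout (the project's own code is PHP and is not shown in this write-up):

```python
def filter_by_department(matches, departments=None):
    """Keep matches in the chosen departments, or all of them if none chosen."""
    if not departments:
        return list(matches)  # no filter: cross-department discovery stays on
    wanted = {d.strip().lower() for d in departments}
    return [m for m in matches if m["department"].strip().lower() in wanted]
```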

In Action

See it in action.

Match Results Dashboard

Ranked Faculty Matches

After a student submits their research interests, the results page shows ranked faculty matches with composite scores, breakdowns of which algorithms contributed, and preview cards with each professor's photo, department, and research areas. Everything is sortable and filterable.

Faculty Detail Page

Faculty Profile Deep Dive

Each matched faculty member has a dedicated detail page with their full bio, profile photo, current research projects, publication keywords, lab affiliations, and contact information. Students get everything they need to write a relevant outreach message.

Student Interest Input

Research Interest Input Flow

The guided input interface lets students describe their interests through natural language text, clickable topic tags, and department selectors. All inputs feed into the matching engine to produce personalized results.

Results

The numbers speak.

Matching Algorithms

Exact matching, semantic similarity, and departmental alignment work together to find faculty mentors that keyword search alone would miss

Faculty Indexed

Profiles scraped and normalized from across departmental websites into a unified, searchable index with research interests, photos, and contact details

Match Satisfaction

Students rated their top-matched faculty as relevant to their research interests, validating the multi-algorithm scoring approach

Semantic-Powered Discovery Engine

NLP-driven similarity scoring catches conceptual overlaps that exact keyword matching misses, connecting students with mentors they wouldn't have found on their own

Insights

What I learned.

Exact Match Gets You Started, Semantics Gets You There

The first version used only keyword matching, and the results were technically correct but painfully narrow. A student searching "artificial intelligence" wouldn't see professors working in "reinforcement learning" or "computer vision." Adding semantic similarity turned the tool from a glorified search bar into something that could actually make connections students wouldn't have thought to look for.

Photos Change Everything About Trust

When I first launched without faculty photos, students treated the results like a database query, scanning names and clicking away. Adding profile photos changed the whole feel. Students started browsing longer, reading full profiles, and actually reaching out. It seems small, but putting a face to a name turned a tool into an introduction.

Scraping Is the Hardest Part Nobody Warns You About

Every department's website had a different structure, different naming conventions, and different levels of completeness. Some pages listed research interests in bullet points, others buried them in paragraph-long bios, and a few had no research section at all. Building a scraper flexible enough to handle all of these edge cases took longer than building the matching engine itself, but without clean data, no algorithm can save you.
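The three layouts mentioned above (bullet lists, interests buried in a bio, no research section) suggest a fallback chain. The sketch below uses Python's standard `html.parser` purely as an illustration; the project's real scraper is not shown here, and the regular expression, class name, and function name are assumptions. It also assumes the HTML passed in is already the profile body, so that any list on it is the interests list.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text of every <li> and <p> element."""

    def __init__(self):
        super().__init__()
        self.list_items = []
        self.paragraphs = []
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("li", "p"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = " ".join("".join(self._buffer).split())
            if text:
                target = self.list_items if tag == "li" else self.paragraphs
                target.append(text)
            self._current = None


def extract_interests(profile_html):
    """Try a bullet list first, then a sentence in the bio, then give up."""
    parser = _TextCollector()
    parser.feed(profile_html)
    if parser.list_items:  # layout 1: interests as bullet points
        return parser.list_items
    pattern = r"research interests (?:include|are)[:\s]+([^.]+)"
    for paragraph in parser.paragraphs:  # layout 2: buried in the bio
        found = re.search(pattern, paragraph, re.IGNORECASE)
        if found:
            pieces = re.split(r",|\band\b", found.group(1))
            return [p.strip() for p in pieces if p.strip()]
    return []  # layout 3: no research section at all
```

Returning an empty list for the third case, rather than guessing from the bio, keeps bad tags out of the index and makes incomplete profiles easy to flag for manual review.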

Faculty Research Matcher
