tum-search
A TUM-focused search and knowledge-graph system combining recursive crawling, AI summaries, vector search, graph structure, and live crawl feedback.
Why this article exists
Search is a good test of whether a system can respect both structure and intent. This project explores how university knowledge can be crawled, summarized, embedded, connected, and updated without reducing ranking to keyword matching.
Problem
Campus knowledge search needs more than text lookup: recursive crawling, concise page summaries, semantic retrieval, graph relationships, and visible progress when the index changes.
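To make the crawling and progress requirements concrete, here is a minimal sketch of a depth-limited, breadth-first crawl that reports each visited page through a callback. It is not the repository's crawler: the `LINKS` map, the `crawl` function, and the shape of the progress event are all hypothetical, and the in-memory link map stands in for real HTTP fetching and HTML parsing. In a live system the callback could forward each event to a WebSocket client.

```python
from collections import deque
from typing import Callable, Dict, List

# Hypothetical in-memory link map standing in for fetched pages;
# a real crawler would download and parse HTML instead.
LINKS: Dict[str, List[str]] = {
    "/": ["/study", "/research"],
    "/study": ["/study/programs", "/"],
    "/research": ["/research/labs"],
    "/study/programs": [],
    "/research/labs": ["/research/labs/ai"],
    "/research/labs/ai": [],
}


def crawl(start: str, max_depth: int,
          on_progress: Callable[[dict], None]) -> List[str]:
    """Breadth-first crawl up to max_depth, reporting each visited page."""
    seen = {start}                 # pages already queued, to avoid loops
    queue = deque([(start, 0)])    # (url, depth) pairs waiting to be visited
    visited: List[str] = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        # Assumed event shape; a server could push this dict to clients.
        on_progress({"url": url, "depth": depth,
                     "visited": len(visited), "queued": len(queue)})
        if depth == max_depth:
            continue               # do not follow links past the depth limit
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited


events: List[dict] = []
pages = crawl("/", max_depth=2, on_progress=events.append)
```

With `max_depth=2`, the page three links away from the start is never visited, and the back-link from `/study` to `/` is skipped because it is already in `seen`.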
What shipped
Crawler, Gemini-powered summaries, Qdrant/CLIP vector search, knowledge graph ideas, WebSocket crawl progress, dependency checks, setup scripts, and admin utilities.
Evidence
The README documents the crawler, summarization, vector search, knowledge graph, WebSocket updates, setup, environment configuration, and admin tools.
Inspect path
Inspect the README, `web_server.py`, dependency scripts, crawler/summarization paths, Qdrant configuration, and admin scripts for database clearing and summary regeneration.
Boundary
The public README exposes a research/prototype search system, not a production campus search service or validated ranking benchmark.
What changed
Search quality became a systems question: topology, semantics, generated summaries, and update feedback matter together before ranking claims are credible.
Next question
Which signal should be trusted first when graph structure and semantic similarity disagree?
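One way to make this question concrete is a toy ranking that blends the two signals with a single weight. The sketch below is an illustration under stated assumptions, not the project's ranking method: the `rank` function, the `alpha` weight, the two-dimensional vectors, and the graph scores (assumed to be normalized to the range 0 to 1) are all hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: List[float],
         pages: Dict[str, Tuple[List[float], float]],
         alpha: float) -> List[str]:
    """Order pages by a weighted blend of the two signals.

    alpha = 1.0 trusts semantic similarity only;
    alpha = 0.0 trusts the graph score only.
    """
    scored = [
        (alpha * cosine(query, vec) + (1 - alpha) * graph_score, name)
        for name, (vec, graph_score) in pages.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical pages: (embedding, graph score assumed to lie in [0, 1]).
PAGES = {
    "hub": ([1.0, 0.0], 0.9),    # well linked, but a weak semantic match
    "leaf": ([0.0, 1.0], 0.1),   # poorly linked, but a strong semantic match
}
QUERY = [0.0, 1.0]

semantic_first = rank(QUERY, PAGES, alpha=0.8)
graph_first = rank(QUERY, PAGES, alpha=0.2)
```

The same two pages swap places as `alpha` moves from 0.8 to 0.2, which is the disagreement in miniature: the weight encodes which signal is trusted first, and choosing it well would need relevance judgments that this sketch does not supply.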
Open public repository
https://github.com/89325516/tum-search