tum-search

A TUM-focused search and knowledge-graph system combining recursive crawling, AI summaries, vector search, graph structure, and live crawl feedback.

Why this article exists

Search is a good test of whether a system can respect both structure and intent. This project explores how university knowledge can be crawled, summarized, embedded, connected, and updated without reducing ranking to keyword matching.

Problem

Campus knowledge search needs more than text lookup: recursive crawling, concise page summaries, semantic retrieval, graph relationships, and visible progress when the index changes.
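To make the crawling and progress requirements concrete, here is a minimal sketch of a depth-limited recursive crawl that reports each visited page through a callback. It is not taken from the repository: the `crawl` function, the `SITE` link map, and the `on_progress` callback are hypothetical stand-ins, with an in-memory dictionary in place of real HTTP fetches and a plain list in place of a WebSocket channel.

```python
from collections import deque
from typing import Callable, Dict, List


def crawl(
    start: str,
    links: Dict[str, List[str]],
    max_depth: int,
    on_progress: Callable[[str, int], None],
) -> List[str]:
    """Breadth-first crawl over an in-memory link map, bounded by depth."""
    seen = {start}
    order: List[str] = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        on_progress(url, depth)  # stand-in for a live progress message
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen:  # skip pages already queued or visited
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order


# Hypothetical toy site: page path -> outgoing links.
SITE = {
    "/": ["/study", "/research"],
    "/study": ["/study/cs", "/"],
    "/research": ["/research/ai"],
    "/study/cs": [],
    "/research/ai": ["/research/ai/labs"],
}

events: List[tuple] = []
visited = crawl("/", SITE, max_depth=2,
                on_progress=lambda url, depth: events.append((url, depth)))
```

With `max_depth=2`, the page `/research/ai/labs` is never visited because it sits three links from the start. In a real system the callback is where a progress message would be pushed to connected clients.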

What shipped

Crawler, Gemini-powered summaries, Qdrant/CLIP vector search, knowledge graph ideas, WebSocket crawl progress, dependency checks, setup scripts, and admin utilities.

Evidence

The README documents the crawler, summarization, vector-search, knowledge-graph, WebSocket update, setup, environment, and admin-tool surfaces.

Inspect path

Inspect the README, `web_server.py`, dependency scripts, crawler/summarization paths, Qdrant configuration, and admin scripts for database clearing and summary regeneration.

Boundary

The public README describes a research prototype, not a production campus search service or a validated ranking benchmark.

What changed

Search quality became a systems question: topology, semantics, generated summaries, and update feedback have to work together before any ranking claim is credible.

Next question

Which signal should be trusted first when graph structure and semantic similarity disagree?
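One way to explore this question is to blend the two signals with an explicit weight and flag the results where they diverge. The sketch below is an illustration under stated assumptions, not the project's ranking method: the `rank` function, the `alpha` weight, the `gap` threshold, the two-dimensional embeddings, and the graph scores in `PAGES` are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank(
    query_vec: Sequence[float],
    pages: Dict[str, Tuple[Sequence[float], float]],
    alpha: float = 0.5,
    gap: float = 0.5,
) -> List[Tuple[str, float, bool]]:
    """Rank pages by alpha * semantic + (1 - alpha) * graph score.

    pages maps a name to (embedding, graph_score), with graph_score
    assumed to be pre-normalized to [0, 1]. The boolean in each result
    marks pages where the two signals differ by more than `gap`.
    """
    results = []
    for name, (vec, graph_score) in pages.items():
        sem = cosine(query_vec, vec)
        blended = alpha * sem + (1 - alpha) * graph_score
        disagree = abs(sem - graph_score) > gap
        results.append((name, blended, disagree))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Hypothetical pages: (toy embedding, normalized graph score).
PAGES = {
    "hub": ([0.0, 1.0], 0.9),   # well linked, off-topic for the query
    "leaf": ([1.0, 0.0], 0.2),  # on-topic, poorly linked
    "both": ([1.0, 1.0], 0.8),  # reasonably on-topic and well linked
}
QUERY = [1.0, 0.0]

balanced = rank(QUERY, PAGES)
semantic_only = rank(QUERY, PAGES, alpha=1.0)
graph_only = rank(QUERY, PAGES, alpha=0.0)
```

In this toy data, the top result changes from `leaf` to `hub` as `alpha` moves from 1.0 to 0.0, and the disagreement flag marks exactly the pages whose rank depends on that choice. Those flagged pages would be the natural candidates for manual relevance review.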

Open public repository

https://github.com/89325516/tum-search
