WittySearch Documentation

Visit the project's GitHub

Backend Architecture (app.py)

The backend is powered by Python and Flask. It handles routing, file system traversal, and data pagination.
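As a rough illustration of the pagination step, the sketch below slices a result list into pages. The function name paginate, the per_page default, and the returned keys are hypothetical and not taken from app.py.

```python
import math


def paginate(items, page, per_page=10):
    """Return one page of items plus basic paging metadata (hypothetical helper)."""
    total_pages = max(1, math.ceil(len(items) / per_page))
    # Clamp out-of-range page numbers instead of raising an error.
    page = min(max(1, page), total_pages)
    start = (page - 1) * per_page
    return {
        "items": items[start:start + per_page],
        "page": page,
        "total_pages": total_pages,
        "total": len(items),
    }
```

Clamping the page number is one possible design choice; a route could just as well answer an out-of-range page with a 404.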

The search_local_files Function

This is the core engine of WittySearch. It uses os.walk() to recursively scan the designated SEARCH_DIR. For each file, it performs several operations:

  • Checks whether the file extension matches the user's filetype: filter, taking aliases into account (e.g., treating .htm and .html as the same).
  • Calculates file size in KB and extracts the last modified date using os.stat().
  • Searches inside text-based files to generate a contextual snippet around the search query.
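
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual body of search_local_files: the names scan_files, normalize_ext, make_snippet, EXT_ALIASES, and TEXT_EXTS are hypothetical, and the rule that a file matches on either its name or its content is an assumption.

```python
import os
import re
from datetime import datetime

# Assumed alias table; the real mapping in app.py may differ.
EXT_ALIASES = {"htm": "html", "jpeg": "jpg"}
# Assumed set of extensions treated as text-based.
TEXT_EXTS = {"txt", "md", "html", "css", "js", "py"}


def normalize_ext(ext):
    """Lowercase an extension, drop the dot, and map aliases to one name."""
    ext = ext.lower().lstrip(".")
    return EXT_ALIASES.get(ext, ext)


def make_snippet(text, query, radius=40):
    """Return up to `radius` characters of context around the first match."""
    match = re.search(re.escape(query), text, re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - radius)
    end = min(len(text), match.end() + radius)
    # Collapse newlines and runs of spaces so the snippet fits on one line.
    return " ".join(text[start:end].split())


def scan_files(search_dir, query, filetype=None):
    """Walk search_dir and collect metadata and snippets for matching files."""
    wanted = normalize_ext(filetype) if filetype else None
    results = []
    for root, _dirs, files in os.walk(search_dir):
        for name in files:
            ext = normalize_ext(os.path.splitext(name)[1])
            if wanted and ext != wanted:
                continue
            path = os.path.join(root, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is not accessible
            snippet = None
            if ext in TEXT_EXTS:
                try:
                    with open(path, encoding="utf-8", errors="ignore") as fh:
                        snippet = make_snippet(fh.read(), query)
                except OSError:
                    pass  # unreadable content: fall back to the name check
            # Assumption: a file matches on its name or on its content.
            if snippet is None and query.lower() not in name.lower():
                continue
            results.append({
                "name": name,
                "path": path,
                "size_kb": round(info.st_size / 1024, 2),
                "modified": datetime.fromtimestamp(info.st_mtime).strftime("%Y-%m-%d %H:%M"),
                "snippet": snippet,
            })
    return results
```

Under these assumptions, a call such as scan_files(SEARCH_DIR, "witty", filetype="html") would also return .htm files, since both extensions normalize to the same name.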

HTML Favicon Extraction

A standout feature is the regex-based favicon scraper. When an HTML file is found, the script reads its content and uses a regular expression (<link[^>]+>) to collect link tags, then looks for one with rel="icon" or rel="shortcut icon". It resolves that tag's path, whether relative or absolute, so the search engine can display the website's logo in the results.
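
A minimal sketch of this kind of extraction is shown below. The function name find_favicon, the helper patterns REL_ICON and HREF, and the use of urllib.parse.urljoin for path resolution are assumptions for illustration; only the <link[^>]+> pattern comes from the description above.

```python
import re
from urllib.parse import urljoin

# Pattern from the description above: grab every <link ...> tag.
LINK_TAG = re.compile(r"<link[^>]+>", re.IGNORECASE)
# Assumed helper patterns for the rel and href attributes.
REL_ICON = re.compile(r"""rel\s*=\s*["'](?:shortcut\s+)?icon["']""", re.IGNORECASE)
HREF = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def find_favicon(html, base_url):
    """Return the resolved favicon URL of an HTML document, or None."""
    for tag in LINK_TAG.findall(html):
        if not REL_ICON.search(tag):
            continue  # stylesheet, preload, or another kind of link
        href = HREF.search(tag)
        if href:
            # urljoin leaves absolute URLs alone and resolves relative ones
            # against the location of the HTML file.
            return urljoin(base_url, href.group(1))
    return None
```

Regular expressions are enough for well-formed link tags like these, but they can miss unusual markup such as unquoted attribute values; an HTML parser would be the more robust alternative.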

OS Integration & Routing

The app includes specialized routes to serve files securely: