paper_firehose.processors.html_generator

HTML output generation for filtered articles. Based on the original feedfilter.py HTML generation logic.

Classes

HTMLGenerator([template_path])

Generates HTML output files for filtered articles.

class paper_firehose.processors.html_generator.HTMLGenerator(template_path='html_template.html')[source]

Bases: object

Generates HTML output files for filtered articles.

Parameters:

template_path (str)

generate_html_for_topic_from_database(db_manager, topic_name, output_path, topic_description=None)[source]

Generate HTML for a topic directly from papers.db. This method can be called on its own, without running the filter command first.

Parameters:
  • db_manager – Database manager instance

  • topic_name (str) – Name of the topic

  • output_path (str) – Path to output HTML file

  • topic_description (str) – Optional description for the topic

Return type:

None

generate_html_from_database(db_manager, topic_name, output_path, heading=None, description=None)[source]

Generate an HTML file for filtered entries pulled directly from papers.db.

Parameters:
  • db_manager – Database manager instance

  • topic_name (str) – Name of the topic

  • output_path (str) – Path to the output HTML file

  • heading (str)

  • description (str) – Optional subheading text to include beneath the page title

Return type:

None

generate_pqa_summarized_html_from_database(db_manager, topic_name, output_path, title=None, description=None)[source]

Generate an HTML file with all ranked entries for a specific topic.

Entries with a paper_qa_summary show the full PQA summary box. Entries without one show only the abstract/summary, as in the ranked HTML output. All entries are sorted by rank_score in descending order.

Parameters:
  • db_manager – Database manager instance

  • topic_name (str)

  • output_path (str)

  • title (str)

  • description (str)

Return type:

None
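
The branching described for this method can be illustrated with a minimal sketch. It is not the library's implementation: the dictionary keys `title` and `summary`, the CSS class `pqa-summary`, and the helper names `render_entry` and `render_entries` are assumptions made for illustration. Only `paper_qa_summary` and `rank_score` come from the description above.

```python
from html import escape


def render_entry(entry: dict) -> str:
    """Return an HTML fragment for one entry (hypothetical layout)."""
    title = escape(entry.get("title", ""))
    pqa = entry.get("paper_qa_summary")
    if pqa:
        # Entry has a PQA summary: show it in a dedicated box.
        body = f'<div class="pqa-summary">{escape(pqa)}</div>'
    else:
        # No PQA summary: fall back to the plain abstract/summary.
        body = f"<p>{escape(entry.get('summary', ''))}</p>"
    return f"<h3>{title}</h3>{body}"


def render_entries(entries: list) -> str:
    """Sort entries by rank_score (descending) and join their fragments."""
    ordered = sorted(entries, key=lambda e: e.get("rank_score") or 0.0, reverse=True)
    return "\n".join(render_entry(e) for e in ordered)
```

Treating a missing rank_score as 0.0 is a choice made for this sketch; the source does not say how such entries are handled.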

generate_ranked_html_from_database(db_manager, topic_name, output_path, heading=None, description=None)[source]

Generate an HTML file with entries sorted by descending rank_score for a topic.

Displays the rank score truncated to two decimals next to each entry.

Parameters:
  • db_manager – Database manager instance

  • topic_name (str)

  • output_path (str)

  • heading (str)

  • description (str)

Return type:

None
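
As a rough illustration of the ordering and score display described for this method, the sketch below sorts entries by rank_score and truncates (rather than rounds) the score to two decimals. The helper names `truncate_score` and `rank_entries` and the dictionary-based entry shape are hypothetical, and the library may format scores differently.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_score(score: float) -> str:
    """Truncate a score to two decimals without rounding up."""
    # Going through str() avoids binary floating-point artifacts
    # such as 0.29 * 100 evaluating to 28.999999999999996.
    value = Decimal(str(score)).quantize(Decimal("0.01"), rounding=ROUND_DOWN)
    return str(value)


def rank_entries(entries: list) -> list:
    """Return entries sorted by rank_score, highest first."""
    return sorted(entries, key=lambda e: e["rank_score"], reverse=True)
```

With truncation, a score of 0.987 is displayed as 0.98, whereas rounding would display 0.99.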

process_text(text)[source]

Escape HTML special characters in text and handle embedded LaTeX code.

Parameters:

text (str)

Return type:

str
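
The source does not specify how LaTeX is handled, so the following is only a sketch of one plausible approach: escape the text with the standard library's `html.escape`, then wrap inline `$...$` math in a span using `\(...\)` delimiters that a renderer such as MathJax could pick up. The function name `process_text_sketch`, the `math` CSS class, and the delimiter choice are assumptions, not the behavior of `HTMLGenerator.process_text`.

```python
import html
import re

# Inline math: "$...$" whose content neither starts nor ends with whitespace,
# so that currency amounts such as "$5 and $10" are left alone.
_INLINE_MATH = re.compile(r"\$([^\s$](?:[^$]*[^\s$])?)\$")


def process_text_sketch(text: str) -> str:
    """Escape HTML special characters, then mark up inline LaTeX math."""
    escaped = html.escape(text, quote=True)
    return _INLINE_MATH.sub(
        lambda m: f'<span class="math">\\({m.group(1)}\\)</span>',
        escaped,
    )
```

Escaping before the math substitution means characters like `<` inside a formula are emitted as entities, which a browser decodes back to the original character before any math renderer reads the text.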