Cooking Notes section incorrectly includes user comments #8

Open
niklas-joh opened this issue Apr 18, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@niklas-joh
Owner

niklas-joh commented Apr 18, 2025

Issue Description

The Cooking Notes section in plant data includes user comments and questions that are not related to cooking instructions. For example, in the Beets entry, the Cooking Notes section contains user questions about Japanese beetles, growing conditions, and other unrelated topics.

Expected Behavior

The Cooking Notes section should only contain information related to cooking and preparing the plant, not user comments or questions.

Current Behavior

The Cooking Notes section includes user comments and questions that are unrelated to cooking. In the Beets entry, for example, these include:

  • Questions about why beets aren't forming round roots
  • Methods for growing beets
  • Questions about Japanese beetles on grape vines
  • Stories about using ducks to control Japanese beetles

Proposed Solution

Filter out user comments from the Cooking Notes section during the scraping or transformation process. This could be done by:

  1. Identifying patterns that indicate user comments (e.g., question format, usernames)
  2. Only including content that appears to be official cooking instructions
  3. Possibly creating a separate section for user tips if they're valuable
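The three steps above could be sketched roughly as follows. This is only an illustration, not the project's code: the function name `split_cooking_notes`, both regular expressions, and the assumption that each comment sits on its own line are hypothetical and would need checking against the real scraped data.

```python
import re

# Hypothetical patterns; the actual scraped text may need different ones.
# A paragraph ending in "?" is treated as a user question.
QUESTION_RE = re.compile(r"\?\s*$")
# A paragraph starting with "@name" or "Name says:" is treated as a user post.
USERNAME_RE = re.compile(r"^(?:@\w+|\w[\w .'-]{0,30}\s(?:says|wrote|asks):)")


def split_cooking_notes(text):
    """Split raw Cooking Notes text into (official_notes, user_tips).

    Assumes one paragraph per line. Anything that looks like a user
    comment goes to user_tips instead of being discarded, so it can be
    stored in a separate section if it turns out to be valuable.
    """
    official, user_tips = [], []
    for paragraph in text.split("\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if QUESTION_RE.search(paragraph) or USERNAME_RE.search(paragraph):
            user_tips.append(paragraph)
        else:
            official.append(paragraph)
    return "\n".join(official), "\n".join(user_tips)


# Made-up sample text modelled on the Beets example in this issue.
raw = (
    "Beets can be roasted, boiled, or steamed.\n"
    "Why are my beets not forming round roots?\n"
    "@gardener42 I use ducks to control Japanese beetles."
)
official, tips = split_cooking_notes(raw)
```

Returning the user content rather than dropping it keeps option 3 open without changing the filter later.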

Affected Files

  • src/notion/transformer.py
  • scripts/sync_to_notion_requests.py
  • Any scraper files that extract the Cooking Notes content
@niklas-joh niklas-joh added the bug Something isn't working label Apr 18, 2025
niklas-joh added a commit that referenced this issue Apr 18, 2025
…a new function to identify and remove user comments and questions from the Cooking Notes section, keeping only cooking-related content and nutrition information.
@niklas-joh
Owner Author

Fixed in commit f29f478. Added a new function to filter out user comments from the Cooking Notes section, keeping only cooking-related content and nutrition information. The filter identifies user comments based on patterns like questions, first-person pronouns, and non-cooking topics, while preserving content related to cooking instructions and nutrition.
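For readers who have not opened the commit, the kind of heuristic described above might look something like the sketch below. It is not the code from f29f478: the names `looks_like_user_comment` and `keep_cooking_sentences`, the keyword stems, and the pronoun list are all assumptions made for illustration.

```python
import re

# Hypothetical word stems that mark a sentence as cooking or nutrition related.
ON_TOPIC_STEMS = (
    "roast", "boil", "steam", "bak", "saute", "cook",
    "recipe", "vitamin", "fiber", "nutri", "calorie",
)
# Hypothetical first-person markers that suggest a personal comment.
FIRST_PERSON_RE = re.compile(r"\b(?:I|me|my|we|our)\b", re.IGNORECASE)


def looks_like_user_comment(sentence):
    """Return True for questions, first-person remarks, or off-topic text."""
    words = re.findall(r"[a-z]+", sentence.lower())
    on_topic = any(word.startswith(ON_TOPIC_STEMS) for word in words)
    is_question = sentence.rstrip().endswith("?")
    is_first_person = bool(FIRST_PERSON_RE.search(sentence))
    return is_question or is_first_person or not on_topic


def keep_cooking_sentences(text):
    """Drop sentences flagged as user comments and keep the rest in order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = [s for s in sentences if s and not looks_like_user_comment(s)]
    return " ".join(kept)


# Made-up sample text modelled on the Beets example in this issue.
sample = (
    "Roast beets until tender. How long should beets boil? "
    "I use ducks to control Japanese beetles. "
    "Beets are high in fiber and vitamin C."
)
cleaned = keep_cooking_sentences(sample)
```

A keyword approach like this is easy to over-tune: a genuine cooking tip written in the first person would be dropped, which may be one thing to look at when reviewing the filter.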

@niklas-joh
Owner Author

This issue has been reopened for further review. The previous fix (adding the filter_user_comments_from_cooking_notes function in src/processors/content_cleaner.py) is still in place. Please review if additional changes are needed or if the current implementation needs to be modified.
