Database Purpose
Provide a resource for those who like or need to plan their grocery shopping in advance and want to compare prices across all grocery store chains in the DFW region. This will enable those who are trying to stretch their dollar the furthest and still meet all their nutritional needs.
With rapid inflation over the past few years, it is critically important that those who live month-to-month better plan their budget. With this database, users will be able to input their grocery items, and the price for each item at each grocery store (Kroger, Tom Thumb, Aldi, H-E-B, and Whole Foods) to determine their cheapest shopping cart possible.
DRAFT Schema (subject to change)
Store |
||
This table
will store grocery stores. |
||
Column
Name |
Data
Type |
Description |
id |
INTEGER |
Unique
identifier for the store |
name |
TEXT |
Name of
the store |
website |
TEXT |
URL of the
store's website |
last_scraped |
TIMESTAMP |
Date and
time when the store's website was last scraped |
Product |
||
This table
will store the grocery items to be scraped from the web. |
||
Column
Name |
Data
Type |
Description |
id |
INTEGER |
Unique
identifier for the product |
name |
TEXT |
Name of
the product |
unit |
TEXT |
Unit in
which the product is sold (e.g. "lbs", "oz", "each") |
category |
TEXT |
Category
of the product (e.g. "fruits",
"vegetables", "meat") |
Price |
||
This table
will store the price information of the grocery items at each store. |
||
Column
Name |
Data
Type |
Description |
id |
INTEGER |
Unique
identifier for the price |
store_id |
INTEGER |
Foreign
key reference to the Store table |
product_id |
INTEGER |
Foreign
key reference to the Product table |
price |
NUMERIC |
Price of
the product at the store |
scraped_date |
TIMESTAMP |
Date and
time when the price was scraped |
Interface
Ideally, this database will be accessed by a front facing GUI where the grocery items can be inputted and the prices and be displayed. Options are still be explored for how this will be done.
Method
· Create database with the above schema using a cloud database such as MongoDB, Amazon Web Services (AWS), Google Cloud SQL, or Microsoft Azure SQL Database.
· Design a web scraping tool in python that will extract price information from each store every morning for the most up-to-date pricing.
· Have the web scraping tool store and update in the database.
· Design a front facing GUI that will enable users to interact with the database (method TBD).