Database Purpose

Provide a resource for those who like or need to plan their grocery shopping in advance and want to compare prices across all grocery store chains in the DFW region. This will enable those who are trying to stretch their dollar the furthest and still meet all their nutritional needs.

With rapid inflation over the past few years, it is critically important that those who live month-to-month better plan their budget. With this database, users will be able to input their grocery items, and the price for each item at each grocery store (Kroger, Tom Thumb, Aldi, H-E-B, and Whole Foods) to determine their cheapest shopping cart possible.

DRAFT Schema (subject to change)

Store

This table will store grocery stores.

Column Name

Data Type

Description

id

INTEGER

Unique identifier for the store

name

TEXT

Name of the store

website

TEXT

URL of the store's website

last_scraped

TIMESTAMP

Date and time when the store's website was last scraped

Product

This table will store the grocery items to be scraped from the web.

Column Name

Data Type

Description

id

INTEGER

Unique identifier for the product

name

TEXT

Name of the product

unit

TEXT

Unit in which the product is sold (e.g. "lbs", "oz", "each")

category

TEXT

Category of the product (e.g. "fruits", "vegetables", "meat")

Price

This table will store the price information of the grocery items at each store.

Column Name

Data Type

Description

id

INTEGER

Unique identifier for the price

store_id

INTEGER

Foreign key reference to the Store table

product_id

INTEGER

Foreign key reference to the Product table

price

NUMERIC

Price of the product at the store

scraped_date

TIMESTAMP

Date and time when the price was scraped

 

Interface

Ideally, this database will be accessed by a front facing GUI where the grocery items can be inputted and the prices and be displayed. Options are still be explored for how this will be done.

Method

·         Create database with the above schema using a cloud database such as MongoDB, Amazon Web Services (AWS), Google Cloud SQL, or Microsoft Azure SQL Database.

·         Design a web scraping tool in python that will extract price information from each store every morning for the most up-to-date pricing.

·         Have the web scraping tool store and update in the database.

·         Design a front facing GUI that will enable users to interact with the database (method TBD).