Database Project Proposal
Bobby Kozub
Information Management
Spring 2023
Project Purpose
• Provide a resource for those who need to plan their grocery shopping in
advance and want to compare prices across all grocery store chains in the
DFW region.
• This will enable those who are trying to stretch their dollar the furthest
with a comparison resource and still meet all their nutritional needs. With
rapid inflation over the past few years, it is critically important that those
who live month-to-month better plan their budget.
• With this database, users will be able to input their grocery items, and the
price for each item at each grocery store (Kroger, Tom Thumb, Aldi, H-E-B,
and Trader Joes) to determine their cheapest shopping cart possible.
Ideally, nutritional information will also be provided for those who are
trying to be more health conscious.
Proposed
Schema
Store
This table will store grocery stores.
Column Name
Data Type
Description
id
INTEGER
Unique identifier for the store
name
TEXT
Name of the store
website
TEXT
URL of the store's website
last_scraped
TIMESTAMP
Date and time when the store's website was last scraped
Product
This table will store the grocery items to be scraped from the web.
Column Name
Data Type
Description
id
INTEGER
Unique identifier for the product
name
TEXT
Name of the product
unit
TEXT
Unit in which the product is sold (e.g., "
lbs", "oz", "each")
category
TEXT
Category of the product (e.g., "fruits", "vegetables", "meat")
Pric e
This table will store the price information of the grocery items at each store.
Column Name
Data Type
Description
id
INTEGER
Unique identifier for the price
store_id
INTEGER
Foreign key reference to the Store table
product_id
INTEGER
Foreign key reference to the Product table
price
NUMERIC
Price of the product at the store
scraped_date
TIMESTAMP
Date and time when the price was scraped
Interface
Ideally, this database will be accessed by a
front facing GUI where the grocery items can
be inputted and the prices and be displayed.
Options are still be explored for how this will
be done.
Methods
Create database with the
above schema using a
cloud database such as
MongoDB, Amazon Web
Services (AWS), Google
Cloud SQL, or Microsoft
Azure SQL Database.
1
Design a web scraping tool
in python that will extract
price information from
each store every morning
for the most up-to-date
pricing.
2
Web scraping tool store
and update in the database
daily.
3
Design a front facing GUI
that will enable users to
interact with the database
(method TBD).
4