Sunday, May 31, 2020

Week Twenty-One - CST 363 - Database Systems

Someone described normalization of a DB design as "a column depends on the key, the whole key, and nothing but the key, so help me Codd."  Explain in your words what 3rd normal form is and why it is important.

Each normal form builds upon the one before. Third normal form introduces non-transitive dependency. In other words, all non-primary fields should be dependent on the primary key. This plays an important role in maintaining data integrity. When duplicated data changes, there is a risk of updating only some of the data, especially if it is spread out in many different places in the database. We avoid this error by storing data only once in the proper table, and then referencing it as needed.

If an index will speed up data retrieval, why is it not a good idea to index every column? 

If indexes are not properly created, the server may have to go through more records to retrieve the data requested by a query. This uses more hardware resources (processor, memory, disk, and network) and takes more time. A wrong index can be an index created on a column that doesn’t provide easier data manipulation or an index created on multiple columns which instead of speeding up queries, slows them down.

What is an SQL view and why is it useful?

A view is a virtual table, defined by a query, that does not exist until it is invoked by name in an SQL statement. One important use of views is security. Views can be made accessible to users while the underlying tables are not. This allows database administrators to give users only the data they need, while protecting other data in the same table.

Tuesday, May 26, 2020

Week Twenty - CST 363 - Database Systems

When coding a Java program that will perform a SELECT statement that return multiple rows,  what are the steps needed?  The first is to make a connection to the database and the last is close the connection.  What are the other steps?

1. Make a connection to the database
2. Create a PrepatedStatement object with the SQL SELECT statement
3. Set the parameters for the PrepatedStatement object with your Java variables
4. Execute the statement using a ResultSet object and the executeQuery() method - executeUpdate(), or execute() can be used for other types of SQL statements.
5. Process the ResulSet.
6. Close the connection.


What is a parameterized SQL statement? 

A parameterized query (also known as a prepared statement) is a means of pre-compiling a SQL statement so that all you need to supply are the "parameters" (think "variables") that need to be inserted into the statement for it to be executed. It's commonly used as a means of preventing SQL injection attacks.


Do a google search for "SQL injection attack".   What is an "injection attack" and how do parameterized statements help to prevent such security attacks?

SQL injection, also known as SQLI, is a common attack vector that uses malicious SQL code for backend database manipulation to access information that was not intended to be displayed. 

The root of the SQL injection problem is in the mixing of the code and the data. With prepared statements the data coming in from the user is treated as data - and there is no way it can be intermixed with your application SQL and/or be interpreted as SQL. The prepared statements "prepare" the SQL query first to find an efficient query plan and user data is guaranteed to never be executed.


Tuesday, May 19, 2020

Week Nineteen - CST 363 - Database Systems


1.  What do you think is the most useful feature in SQL and why?  Grouping, Joins, Sub-selects, something else? 

I’ve only been introduced to this software within the past few weeks so its hard to say. Joins are undoubtedly a fundamental aspect of working with databases and it would be hard to imagine a functional and efficient database software without some variant of the join operation.

2.  What was the most difficult aspect of SQL language to learn? What makes it difficult?

Great question. So far, I have found that each individual function in SQL is simple enough to learn. The tricky part seems to be pulling it all together to form a complex select statement that solves the problem.

3.  Can you think of a question you might want to ask of a database that would be difficult or impossible to do in SQL?  

The views in SQL have limited functionality. It is generally not possible to use the ‘order by’ clause in a view.

Tuesday, May 12, 2020

Week Eighteen - CST 363 - Database Systems


Prompt

SQL has the flexibility to join tables on any column(s) using any predicate (=, >, < ).  As long the column(s) are type compatible.   98% of the time the join uses primary key of one table, foreign key of the other table and equal predicate.  Think of example where joining on something other than keys would be needed.  Write the query both as English sentence and SQL.  If you can't think of your own example, search the internet for an example.

Response

One example of such a query could be utilized in online retail to determine which customers made an order within the first day of creating their online account. The customer table would have an account_created_date field and the orders table would have an order_date field. Comparing these non-primary keys would give us this bit of information.

SELECT customer_name
FROM customers c JOIN orders o
              ON c.account_created_date = o.order_date;

Tuesday, May 5, 2020

Week Seventeen - CST 363 - Database Systems


Relational tables and spreadsheets look similar with both having rows and columns.  What are some important differences between the two?

Spreadsheets are a simplified version of a relational database. One important difference between database tables and spreadsheets is that, unlike spreadsheets, databases are relational, meaning that data between tables can be linked and cross-referenced. Additionally, databases have more efficient built-in mechanisms for additions, deletions, and sorting.

What are some important reasons for using a database rather than just using files?  (from orientation slides)

As mentioned above, databases are relational which gives the user more flexibility when for links and cross-references. Also, databases allow for quick and easy additions, deletions, and sorting.

An SQL select statements has SELECT, FROM, WHERE, ORDER BY, LIMIT parts.  Which are optional and which are required?

Only the SELECT clause is truly required. Generally, the FROM clause is an important piece as well, but there are cases where it is not required.