Project 2 (Hash Attack)
Objectives
Understand the distinction between collision attacks and pre-image attacks.
Test whether theoretical costs are observed in practice
Demonstrate ability to design and conduct an experiment
Demonstrate ability to convey technical results clearly
Students should form a plan for their experiment, implement the necessary code and present their results using at least one chart or graph.
Background
A collision attack in hashing is when an attacker finds two separate source messages that both hash to the same value. The expected time for this attack is 2^(n/2) where n is the number of bits in the hash digest.
A pre-image attack in hashing is when an attacker is given a specific hash value (usually the hash value of an intercepted message) and is asked to find a source message that hashes to that value. The expected time for this attack is 2^n where n is the number of bits in the digest.
Project Description
In this lab, you will gather experimental data regarding how difficult it is to perform a collision and preimage attack on a hash.
For this lab you will need to do the following:
Create a wrapper for SHA-1 which takes as input the string to hash as well as the number of bits (n) that the hash should be. The output will be the SHA-1 hash of the given string truncated to the provided number of bits.
Using this wrapper, conduct a series of collision and preimage attacks at different bit sizes.
You should test at least 4 different bit sizes between the range 8 to 32.
Choose reasonable values for bit sizes (e.g., 8, 16, 20, 24).
You should have at most 2 trials at bit sizes between 8-15.
Don't attempts attacks against bit sizes that take you hours to complete. Be reasonable.
Keep track of the number of attempts it took for the attack to be successful. Do not track wall-clock time.
You should gather at least 50 samples at each bit size.
Write a 1-2 page report, which describes how your experimental results compare to the theoretical difficulty of attacking a hash of a given size.
Your report should be clear and concise.
Your report should include a graph or table summarize your experimental results.
The graph should compare the bitsize with the average number of attempts it took for each attack type to be successful.
Your graph or table should report on the variance you saw during testing.
Recommendation: Use a logarithmic scale graph.
Recommendation: Plot theoretical cost vs experimental results
Have another person review your report and give you feedback on the clarity of the writing and results.
Include the name of your reviewer in your final written report for full credit.
Starting Point:
SHA-1 typically produces a 160-bit digest. To create our toy SHA-1 implementation, we will use a valid SHA-1 implementation and truncate the results so that the digest is of a reduced size.
Although there are many implementations of SHA-1 available on the web, we suggest using one of the following:
-
SHA-1 in Java (
How to use this code) ( This is an independent implementation. You are free to use it, but we would recommend using the standard Java library over this one)
There are several options available for C++. Here are a few:
-
SHA-1 in Python: Look at the included hashlib, and for python3 understand they bytes type versus the bytearray type (bytearray is mutable). Even though there is a learning curve, I recommend python3.
When choosing your language/library, consider that more common libraries like OpenSSL or those included with Java or C# often have better documentation and are easier to get help with than a less commonly used implementation.
Note that in the independent C++ code provided or the standard Java library you will need to call Reset() after you are done working with one string and are ready to call Update() start another. Otherwise, they will assume that the next value provided in update() should just be appended to what has already been used.
If you find another library or source code that could be helpful to other students, please share them with the Google Group.
Grading
The specification for this assignment has been left open-ended intentionally. We expect that each student will create their own plan for performing the attacks and find an effective way of presenting their results.
Grading Rubric:
20 points for report describing your experiment
10 points will be awarded based on the quality of the writeup that explains your experimental design
10 points for the quality and clarity of the graph(s) presenting your data
5 points if results are consistent with the theory or include an explanation why they might differ from the expected cost
5 points for soliciting one external review of your report
Submission
Submit your written report as a PDF file along with a text file (or zip file) with your source code to Learning Suite.
Tips and Hints
Make sure that you use bit sizes not byte sizes.
If you find results that a drastically different from the theoretical values, take a look at your code and look for implementation errors.
Back to top