Make a Hash of it

A good technique for keeping data safe, both at rest and in transit, is to hash it wherever the original value does not need to be read back.

A hash is a function that converts one value to another. Hashing data is a common practice in computer science and is used for several different purposes. Examples include cryptography, compression, checksum generation, and data indexing.

Source: TechTerms (Date accessed: 15/07/2020)

What forms can hashing take?

From a security perspective, this can range from hashing a single item, such as an email address or a password, to hashing an entire file. There are also different hashing algorithms you can use, including:

  • MD5
  • SHA-1
  • SHA-2
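As a minimal sketch of hashing a single item, the Python example below uses the standard library's hashlib module to hash an email address. The helper name hash_text and the sample address are illustrative assumptions and do not come from any particular application.

```python
import hashlib


def hash_text(value, algorithm="sha256"):
    """Return the hex digest of a text value using the named algorithm."""
    digest = hashlib.new(algorithm)
    digest.update(value.encode("utf-8"))
    return digest.hexdigest()


email = "alice@example.com"  # hypothetical sample value

# The same input always produces the same digest.
print(hash_text(email))
print(hash_text(email) == hash_text(email))

# Changing a single character produces a completely different digest.
print(hash_text("Alice@example.com"))
```

Passing "md5" or "sha1" as the second argument selects one of the other algorithms listed above.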

Which is the best hash algorithm to use?

It depends on how secure your data needs to be and, if a third-party application has to verify the hash, on which algorithm that application supports. MD5 and SHA-1 are still common for non-security uses such as basic checksums, but practical collision attacks exist against both, so they should not be relied on where security matters. In the US, particularly for government applications, SHA-2 is the recommended algorithm to use.

What’s the difference?

The main difference between the algorithms is the length of their output. MD5 produces a 128-bit digest, SHA-1 produces 160 bits, and SHA-256, the best-known member of the SHA-2 family, produces, wait for it, 256 bits. There is also SHA-512, which, surprise, surprise, produces 512 bits.
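The output lengths can be checked directly. The sketch below assumes Python's standard hashlib module; the helper name digest_bits and the sample message are illustrative only.

```python
import hashlib


def digest_bits(algorithm):
    """Return the output length of a hashlib algorithm in bits."""
    return hashlib.new(algorithm).digest_size * 8


message = b"Make a hash of it"  # arbitrary sample input

for name in ("md5", "sha1", "sha256", "sha512"):
    hex_digest = hashlib.new(name, message).hexdigest()
    # Each hex character encodes 4 bits, so the hex string is bits / 4 long.
    print(name, digest_bits(name), "bits,", len(hex_digest), "hex characters")
```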

At face value, the longer the digest a hashing algorithm returns, the harder it is to find two inputs that produce the same hash, although the design of the algorithm matters as much as its length.

Common applications for Hashing

The great advantage of a strong hashing algorithm is that collisions, where two different inputs produce the same hash, are so unlikely that they can be ignored in practice. This makes hashing a great tool for checking file integrity.

File tampering

Imagine a scenario where you want to make sure a document hosted on your server has not been tampered with. You can do this by taking a hash of the file. In doing so you create a unique fingerprint of that file, which can be compared every time the file is opened. If the hash varies at any time, the document has been altered in some way.
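A minimal sketch of this check in Python, assuming SHA-256 and the standard hashlib module. The function names hash_file and is_unchanged are hypothetical, and the known-good digest is assumed to be stored somewhere an attacker cannot modify.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    sha256 = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
    return sha256.hexdigest()


def is_unchanged(path, known_digest):
    """Return True if the file's current digest matches the recorded one."""
    return hash_file(path) == known_digest
```

Record hash_file(path) once while the document is known to be good, then call is_unchanged(path, recorded_digest) whenever the file is opened.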

Protecting software – Hash Manifest

You have written an application and you want to make sure the integrity of the software remains consistent and unaltered. To do this, why not hash every script file in your application? Then create a list of those hashes in a CSV file, a hash manifest, which can be compared against every time the software is distributed. If any of those hashes do not match, you know that the application has been tampered with in some way.
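A hash manifest along these lines could be sketched as follows. The CSV column names, the "*.py" file pattern and the function names are assumptions made for illustration; adapt them to the layout of your own application.

```python
import csv
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def write_manifest(root, manifest, pattern="*.py"):
    """Write one CSV row (relative path, digest) per matching file under root."""
    root = Path(root)
    with Path(manifest).open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "sha256"])
        for script in sorted(root.rglob(pattern)):
            writer.writerow([script.relative_to(root).as_posix(), hash_file(script)])


def verify_manifest(root, manifest):
    """Return the relative paths that are missing or whose digest has changed."""
    root = Path(root)
    failures = []
    with Path(manifest).open(newline="") as handle:
        for row in csv.DictReader(handle):
            target = root / row["path"]
            if not target.is_file() or hash_file(target) != row["sha256"]:
                failures.append(row["path"])
    return failures
```

An empty list from verify_manifest means every listed file still matches. This sketch does not detect files added after the manifest was written, and the manifest itself needs to be kept where an attacker cannot rewrite it.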

Block chaining log files

Logging user activity on an application is a fundamental way of keeping your application secure. Imagine a situation where somebody, for nefarious reasons, gained access to the log files and tried to alter them to cover their tracks. How would you know?

One technique is to hash all the entries that came before the newest entry, and also to keep a hash of each individual entry, making each entry the child of the entries before it. Each log entry should contain two hash items: the parent (a hash of all the entries above it) and the child, which is unique to that specific entry. If any of those hashes differ, you can tell that the log has been compromised and, specifically, at which entry.
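One way to sketch this in Python is shown below. It is a simplified variant, not a definitive implementation: rather than rehashing every earlier entry each time, the parent is computed from the previous entry's parent and child hashes, which covers everything above it transitively. The field names, the GENESIS placeholder and the function names are all assumptions.

```python
import hashlib

GENESIS = "0" * 64  # placeholder parent hash for the very first entry


def sha256_hex(text):
    """Return the SHA-256 hex digest of a text value."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_entry(log, message):
    """Append an entry carrying a parent hash (chain so far) and a child hash (this entry)."""
    if log:
        previous = log[-1]
        parent = sha256_hex(previous["parent"] + previous["child"])
    else:
        parent = GENESIS
    log.append({"message": message, "parent": parent, "child": sha256_hex(message)})


def first_bad_entry(log):
    """Return the index of the first entry where verification fails, or None."""
    expected_parent = GENESIS
    for index, entry in enumerate(log):
        if entry["child"] != sha256_hex(entry["message"]):
            return index  # the entry's own content no longer matches its hash
        if entry["parent"] != expected_parent:
            return index  # something above this entry was changed
        expected_parent = sha256_hex(entry["parent"] + entry["child"])
    return None
```

An attacker who can rewrite every hash after the entry they alter could still forge a consistent chain, so it helps to keep a copy of the latest hash somewhere separate from the log itself.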

Protecting data in third party APIs

Sometimes you may have to store sensitive personal information with a third-party API on the web for verification purposes, such as verifying online expense claims. This data could include contact addresses, phone numbers and so on, or even passwords used in applications within an internal network. In that case, unless the data has to be read by an end user, just hash it. Then, if the database it is stored on is ever compromised, the data the attacker sees will be far less useful. Short or predictable values such as phone numbers and passwords can be guessed by hashing candidates, so combine them with a salt or a secret key before hashing.
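As a hedged sketch of this idea, the example below uses a keyed hash (HMAC with SHA-256) instead of a bare hash, so that someone who obtains the stored values cannot simply hash guesses without the key. The names SECRET_KEY, protect and matches are hypothetical, and the key is assumed to be kept on your side and never sent to the third party.

```python
import hashlib
import hmac
import secrets

# Assumption: in a real system the key is loaded from secure storage so that it
# stays the same between runs; a random key is generated here only for the demo.
SECRET_KEY = secrets.token_bytes(32)


def protect(value, key=SECRET_KEY):
    """Return a keyed SHA-256 digest of a value, suitable for storing remotely."""
    normalised = value.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def matches(candidate, stored_digest, key=SECRET_KEY):
    """Check a candidate value against a stored digest using a constant-time comparison."""
    return hmac.compare_digest(protect(candidate, key), stored_digest)


stored = protect("alice@example.com")  # hypothetical value sent to the third party
print(matches("Alice@Example.com ", stored))
print(matches("bob@example.com", stored))
```

For passwords specifically, a deliberately slow, salted function such as hashlib.pbkdf2_hmac is generally a better fit than a single fast hash.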