Continue to Site

Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

  • Welcome to our site! Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

algoritham to compress 12 digit hexa value

Status
Not open for further replies.

sreenivasulun

New Member
let me give my requirement in detailed way....

Actually I want a algoritham to compress 12 digit hexa value into less than 9 digit decimal value.

example : this hexa value "000A3A7C0ED0" should be compressed into less than 9 digit decimal value like "234876123"

again if I uncompress decimal value it has to give same hexa value....

Thanks in advance
 
Without suitable restrictions on the range of possible values there is no known algorithm that will provide a 1:1 map from 2^48 possibilities onto 2^36 code words. The mapping must be 1:1 because you need to have a unique inverse map.
 
sreenivasulun said:
let me give my requirement in detailed way....

Actually I want a algoritham to compress 12 digit hexa value into less than 9 digit decimal value.

example : this hexa value "000A3A7C0ED0" should be compressed into less than 9 digit decimal value like "234876123"

again if I uncompress decimal value it has to give same hexa value....

Thanks in advance

Thats not really how compression works. Although if you assign some probabilities to your numbers you can map it to like a binary 0 or 1. But some of your other numbers in your space will end up being much longer than the number you're trying to compress based on the probability of its occurance, but it if doesn't occur very often then you may not have to worry about that.

You can't simply map a larger space to a smaller space like the post above this one pointed out, thats not how compression works. There would be no way of reversing the compression to get back your hex number, which is the problem.
 
Last edited:
Change it to a higher base. Eg. Base 36, 0 > Z. It will take up less characters as the place value of each is 36 rather than 16.
I hope you understand what I mean... lol.
 
Compression is based on the fact that data is repeated within your subset. For instance, in your example, you could take the original string and calculate the number of occurances of each:
000A3A7C0ED0
Code:
0->5
A->2
3->1
7->1
C->1
E->1
D->1
Then, you sort this index by occurance in descending order (as it is listed above) and assign a key value to each one. The idea being that the key is smaller than the actual chunk of data. In your case, there are 7 unique keys that require 3 bits to cover, and you might assign them as follows:

0 = 000
A = 001
3 = 010
7 = 011
C = 100
E = 101
D = 110


Now that we have a working dictionary, we can compress your data as follows:

Code:
0    0     0    A    3    A    7    C    0     E    D    0
000 000 000 001 010 001 011 100 000 101 110 000

A quick regrouping back into hex, yields:
00145C170

which is 3 characters shorter than the original. However, the usefullness of this is pretty much lost completely in this example because in order to revert back to the original, you need to send the dictionary as well, which in this case would significantly increase the size of what you need to send.

You can also play with the size of each chunk of data as well, and you will get different compression ratios as a result. For instance, in your data we might make the chunk size only half of a nibble or we might make it a full byte. If you were to do it with half a nibble, your compression becomes:
Code:
uncompressed message (binary) = 00 00 00 00 00 00  10 10 00 11 10 10  01 11 11 00 00 00 11 10 11 01 00 00

dictionary:

00 -> 12 
10 -> 5 
11 -> 5
01 -> 2

In this case, the compression would result in a message that is the same size because the key in the dictionary is the same size as the data itself (we would need 4 keys, or two bits). Not to mention that now you have to transmit the dictionary, so your total size has actually increased.

If we make the chunk size a full Byte, your data is compressed like this:

Code:
00 = 000
0A = 001
3A = 010
7C = 011
0E = 100
D0 = 101

compressed message= 0000 0001 0100 1110 0101 = 014E5

In this case, the compressed message is only 5 characters, but the dictionary may be larger. You could go the extreme route and make the chunk size the same size as your message and just transmit the data as "0", being the index of the only key in the dictionary, but then to transmit the dictionary you would be essentially just be sending the uncompressed message.

If you were sending a rather large file with lots of repeated data, you will save lots of space. For example, if the data you are sending is ASCII text, chances are that you are using a small subset of ASCII (ie the alphabet, and numbers, some symbols) substantially more often than some of the more rare characters and it would be advantageous to compress them in this fashion. If you knew you were sending the english language, for instance, you could analyze a large block of text (an ebook or something) and use the distribution of characters to build your dictionary.

So, to answer your question, the only way you could effectively compress your data would be if you knew something about the distribution of data, and could either create a dictionary that is shared between the encoder/decoder or dynamically generated and transmitted alongside the data on the fly.

I've written a nice compression utility in one of my university courses, but I dont remember the precise implementation, and I'm sure I've omitted some details in my brief description above. My implementation relied on building the dictionary with a Huffman tree which uses ordered insertion to create the dictionary, and then you get keys by traversing the tree. See wikipedia: https://en.wikipedia.org/wiki/Huffman_coding for more.
 
Status
Not open for further replies.

Latest threads

New Articles From Microcontroller Tips

Back
Top