Continue to Site

Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

  • Welcome to our site! Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

Running Median

Status
Not open for further replies.

misterT

Well-Known Member
Most Helpful Member
I wanted to implement running median (windowed). But google came up with all kinds of fancy algorithms.. I thought they were just too fancy. This is my first solution to update median, mean, min and max of the last N datapoints:

The main idea is simple.. keep a double buffer of the data where the entries are ordered. Then it is easy to remove and add samples while keeping the samples ordered.

C:
struct rolling_window {
    volatile uint16_t *data;
    volatile uint8_t   index;
    volatile uint8_t   size;

};

struct rolling_median {
    struct rolling_window  window;
    uint16_t              *ordered_data;
    uint16_t               median;
    uint32_t               accum;
};

/**
* Updates Rolling Median with one data entry.
*
* @param[in]  object  Pointer to rolling_median structure.
* @param[in]  data    New data.
*
*/
void
rolling_median_update(struct rolling_median *object, uint16_t data)
/* Updates Rolling Median with one data entry. */
{
    uint16_t temp;
    temp = rolling_window_put_and_get(&(object->window), data);

    find_and_replace_ordered(object->ordered_data, temp, data, object->window.size);

    object->accum += data;
    object->accum -= temp;
    object->median = (object->ordered_data)[(object->window.size)/2];
}
/******************************************************************************/



void find_and_replace_ordered(uint16_t *data, uint16_t old, uint16_t new, uint8_t size)
{
    uint8_t i;

    /* Best case */
    if (old == new) { return; }


    /* Find the (first) index place of the old entry */
    /* This can be optimized using binary search etc. */
    i=0;
    while(old != data[i]) {
        i++;
    }


    if (new > old) {

        /* Move everything one place */
        while (i<(size-1)) {
            if (data[i+1]>=new) {
                data[i] = new;
                return;
            }
           data[i] = data[i+1];
            i++;
        }
    }
    else
    {
        /* Move everything one place */
        while (i>(0)) {
            if (data[i-1]<=new) {
                data[i] = new;
                return;
            }
            data[i] = data[i-1];
            i--;
        }
    }

    data[i]=new;
}
/******************************************************************************/

/**
* Writes one data entry in Rolling Window and returns the oldest entry.
*
* @param[in]  object  Pointer to rolling_window structure.
* @param[in]  data    Data to be written in buffer.
*
* Return uint16_t  Oldest data entry that is removed from the window.
*
*/
uint16_t rolling_window_put_and_get(struct rolling_window *object, uint16_t data);

#define NEXT_INDEX(obj) \
(obj->index) ? (obj->index--) : ((obj->index) = ((obj->size)-1))

uint16_t
rolling_window_put_and_get(struct rolling_window *object, uint16_t data)
/* Writes one data entry in Rolling Window and returns the oldest entry. */
{
    uint16_t temp;

    temp = (object->data)[object->index];
    (object->data)[object->index] = data;
    NEXT_INDEX(object);

    return temp;
}
/******************************************************************************/

This is simple test plot using the algorithm:
**broken link removed**
blue: raw data
red: 64 sample running average
green: 64 sample running median

Any ideas or interest how to improve this.. is this useful at all :)?
I can post more code if somebody shows interest.


Test code:
C:
    /* Initialize the sine table */
    for (int i=0; i<256; i++)
    {
        sineTable[i] = (int8_t)(127.0 * sin(6.283*((float)i)/256.0));
    }

    initialize_uart(B38400);
    sei();

    /* Infinite loop */
    for(;;)
    {
        fbi(PORTB, 0);
        _delay_ms(50);
        accumulator += increment;
        acc2 += inc2;
        temp = sineTable[accumulator>>8] + 128;
        temp += sineTable[acc2>>8] + 128;
     
        if(rand()>30000) {
            temp+= (rand()>>4);
        }
        temp+= (rand()>>10);
        rolling_median_update(&samples, (uint16_t)temp);
     
        temp2 = (uint16_t)((samples.accum)/WINDOW_SIZE);
        printf("%u\t", temp);
        printf("%u\t", temp2);
        printf("%u\n", (samples.median));
    }
 
Last edited:
Thats a nice to follow example! thank you. I like to read good code
 
Yes, I think it would be a good idea to post the complete code.

Maybe I post it to the Code Repository when the module is more "final" than "in development".

I am mainly interested to find out more ideas to improve the function that updates the already ordered array of numbers. I'd like to assume that the input data is continuous data. So consecutive new entries (and consecutive entries that are removed) are close to each other.
C:
/* Array of data is already ordered. We want to remove one existing entry (assumed to exist in the array) and add new entry to the ordered set */
void find_and_replace_ordered(uint16_t*data, uint16_t old, uint16_t new, uint8_t size)
{
   uint8_t i;

   /* Best case */
   if(old == new){return;}

   /* Find the (first) index place of the old entry */
   /* This can be optimized using binary search etc. */
    i=0;
   while(old != data[i]){
        i++;
   }


   if(new > old){
       /* Start moving UP the array until we find the place for the new entry */
       while(i<(size-1)){
           if(data[i+1]>=new){
               /* Found our spot for new data in the ordered array */
                data[i]= new;
               return;
           }
           data[i]= data[i+1]; /* move the next data one down, replacing the removed entry */
            i++;
       }
   }
   else
   {
       /* Start moving DOWN the array until we find the place for the new entry */
       while(i>(0)){
           if(data[i-1]<=new){
               /* Found our spot for new data in the ordered array */
                data[i]= new;
               return;
           }
            data[i]= data[i-1]; /* move the next data one up, replacing the removed entry */
            i--;
       }
   }
    /* This is either first or last index of the array */
    data[i]=new;
}
 
Last edited:
I am mainly interested to find out more ideas to improve the function that updates the already ordered array of numbers.

If data are coming slow then you probably don't need any optimizations.

If data are coming fast you probably don't need a mediane on every step, so an improvement would be to break it into sub-arrays. Say, your rolling is 256, but you need mediane only every 64-th count. Then you maintain 4 arrays. When one of them fills in, you sort it, then either merge them into a big array or use some clever algorithm to find a mediane without merging.
 
If data are coming slow then you probably don't need any optimizations.

If data are coming fast you probably don't need a mediane on every step, so an improvement would be to break it into sub-arrays. Say, your rolling is 256, but you need mediane only every 64-th count. Then you maintain 4 arrays. When one of them fills in, you sort it, then either merge them into a big array or use some clever algorithm to find a mediane without merging.
I'm not yet completely sure what I need, but I would like to get that one algorithm efficient so I can waste processor time on other things. And with big buffer (large time constant) efficiency comes more important. And with battery powered device I want to go to sleep as fast as possible (not important right now, but someday). I have few ideas, but I was hoping to hear some "fresh" ideas too.

At the end my interest is in improving the algorithm in general (for my "library".. and I will share it), I'm not concerned whether I need it or not in the current project.
 
Last edited:
I'm not yet completely sure what I need, but I would like to get that one algorithm efficient so I can waste processor time on other things.

IMHO, efficiency doesn't exist by itself, but is always related to usage.

Say, you have created a very efficient algorithm which calculates median at every step. Then you use it in your application to sample at 100 kHz and output statistics on LCD every second. Your canned algorithm will have to run 100000 times a second, which will put much more strain on processor than any very inefficent algorithm that is run only once a second when LCD demands it.

About the median. The most efficient algorithm will probably not include maintaing fully sorted array all the time. Are you interested in the median, or in maintaining the running sorted array?

Another important consideration is what exactly you want optimize. Is it the worst case (so that the calucation would be guaranteed to fit between interrupts), or the average load (so that it could run in the background with less load)? These algorithms might be different too.
 
Are you interested in the median, or in maintaining the running sorted array?
Both. Forget practical applications. This does not yet have place in well defined practical problem (for me at least). With efficiency I mean the "idea" the "algorithm efficiency" the "big O notation". That is computer science.. and when you nail that, then you can start optimizing for application/processor specific solutions.

Another important consideration is what exactly you want optimize.
I would like to optimize for "continuous data", we can assume that consecutive samples are close to each other. Smooth continuous data would be the "best case".. full scale random noise would be the "worst case".
 
Last edited:
Both. Forget practical applications. This does not yet have place in well defined practical problem (for me at least). With efficiency I mean the "idea" the "algorithm efficiency" the "big O notation". That is computer science.. and when you nail that, then you can start optimizing for application/processor specific solutions.

From the big O viewpoint, it is definitely will be O(N) where N is the "frame size", so you can always increase or decrease speed by changing N.

If you want speed, for sorting:

I would use a cicrular buffer which would hold the structures. Each structire would contain - the sample, a pointer to the next higher sample in the sort order ans a pointer to the next lower element in the sort order. So, the elements of the circular buffer would be linked in a double-linked list.

I would do it like this:

- Advance circular buffer pointer
- Remember the sample at the pointer
- Remove the location from the double-linked list
- Save new sample in the current location circular buffer
- Find the nearest element to the new sample of three known (head of the double-linked list, tail of the double-linked list, old sample)
- Walk from the nearest element in the correct direction through the double-linked list until the soring order place for the new sample is found
- When the place is found, insert the current location into the double-linked list at that location.

It'll be about N/4 list-walking steps per sample (good for large N) plus some O(1) overhead.

I would like to optimize for "continuous data", we can assume that consecutive samples are close to each other. Smooth continuous data would be the "best case".. full scale random noise would be the "worst case".

The calculation of median assumes an existance of underlying statistical distributions which samples are drawn from and samples has to be independent (and thus uncorrelated). High autocorrelations will make calculation of median useless unless your sample is long enough to negate autocorrelations.
 
Status
Not open for further replies.

New Articles From Microcontroller Tips

Back
Top