Covers embedded-system topics: microcontrollers and their peripherals; basic Linux and Windows commands; protocols such as UART, I2C, SPI and CAN; GSM, GPRS and 3G mobile communication systems and their basic differences; and operating system fundamentals.
Friday, January 30, 2015
Function Pointers and Callbacks in C
Function pointers are among the most powerful
tools in C, but are a bit of a pain during the initial stages of
learning. This article demonstrates the basics of function pointers, and
how to use them to implement function callbacks in C. C++ takes a
slightly different route for callbacks, which is another journey
altogether.
A pointer is a special kind of variable that holds the address of
another variable. The same concept applies to function pointers, except
that instead of pointing to variables, they point to functions. If you
declare an array, say, int a[10];
then the array name a
will in most contexts (in an expression or passed as a function
parameter) “decay” to a non-modifiable pointer to its first element
(even though pointers and arrays are not equivalent while
declaring/defining them, or when used as operands of the sizeof
operator). In the same way, given the declaration int func();, the name func decays to a non-modifiable pointer to a function. You can think of func as a const pointer for the time being. But can we declare a non-constant pointer to a function? Yes, we can — just like we declare a non-constant pointer to a variable:
int (*ptrFunc)();
ptrFunc is a pointer to a function that takes no arguments and returns an integer. DO NOT forget to put in the parentheses, otherwise the compiler will assume that ptrFunc is a normal function declaration, which takes nothing and returns a pointer to an integer. Let's try some code. Check out the following simple program:
#include<stdio.h>

/* function prototype */
int func(int, int);

int main(void)
{
    int result;

    /* calling a function named func */
    result = func(10, 20);

    printf("result = %d\n", result);

    return 0;
}

/* func definition goes here */
int func(int x, int y)
{
    return x + y;
}
Compile this with gcc -g -o example1 example1.c and invoke it with ./example1; the output is as follows:

result = 30
This was calling func() the simple way. Let's modify the program to call it using a pointer to a function. Here's the changed main() function:
#include<stdio.h>

int func(int, int);

int main(void)
{
    int result1, result2;

    /* declaring a pointer to a function which takes
       two int arguments and returns an integer */
    int (*ptrFunc)(int, int);

    /* assigning ptrFunc to func's address */
    ptrFunc = func;

    /* calling func() through explicit dereference */
    result1 = (*ptrFunc)(10, 20);

    /* calling func() through implicit dereference */
    result2 = ptrFunc(10, 20);

    printf("result1 = %d result2 = %d\n", result1, result2);

    return 0;
}

int func(int x, int y)
{
    return x + y;
}
The output is as follows:

result1 = 30 result2 = 30
A simple callback function
At this stage, we have enough knowledge to deal with function callbacks. According to Wikipedia, “In computer programming, a callback is a reference to executable code, or a piece of executable code, that is passed as an argument to other code. This allows a lower-level software layer to call a subroutine (or function) defined in a higher-level layer.” Let's try a simple program to demonstrate this. The complete program has three files: callback.c, reg_callback.h and reg_callback.c.
/* callback.c */
#include<stdio.h>
#include"reg_callback.h"

/* callback function definition goes here */
void my_callback(void)
{
    printf("inside my_callback\n");
}

int main(void)
{
    /* initialize function pointer to my_callback */
    callback ptr_my_callback = my_callback;

    printf("This is a program demonstrating function callback\n");

    /* register our callback function */
    register_callback(ptr_my_callback);

    printf("back inside main program\n");

    return 0;
}
/* reg_callback.h */
typedef void (*callback)(void);
void register_callback(callback ptr_reg_callback);
/* reg_callback.c */
#include<stdio.h>
#include"reg_callback.h"

/* registration goes here */
void register_callback(callback ptr_reg_callback)
{
    printf("inside register_callback\n");

    /* calling our callback function my_callback */
    (*ptr_reg_callback)();
}
Compile with gcc -Wall -o callback callback.c reg_callback.c and run ./callback. The output is:

This is a program demonstrating function callback
inside register_callback
inside my_callback
back inside main program
Here, the callback function pointer is passed to a lower-layer function (register_callback) where the callback function needs to be called. We could have written the above code in a single file, but have put the definition of the callback function in a separate file to simulate real-life cases, where the callback function is in the top layer and the function that will invoke it is in a lower layer in a different file. So the program flow is like what can be seen in Figure 1.
The higher layer function calls a lower layer function as a normal call and the callback mechanism allows the lower layer function to call the higher layer function through a pointer to a callback function.
This is exactly what the Wikipedia definition states.
Use of callback functions
One use of callback mechanisms can be seen here:
/* This code catches the alarm signal generated
   from the kernel asynchronously */
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

struct sigaction act;

/* signal handler definition goes here */
void sig_handler(int signo, siginfo_t *si, void *ucontext)
{
    printf("Got alarm signal %d\n", signo);
    /* do the required stuff here */
}

int main(void)
{
    act.sa_sigaction = sig_handler;
    act.sa_flags = SA_SIGINFO;

    /* register signal handler */
    sigaction(SIGALRM, &act, NULL);

    /* set the alarm for 10 sec */
    alarm(10);

    /* wait for any signal from kernel */
    pause();

    /* after signal handler execution */
    printf("back to main\n");

    return 0;
}
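If you compile this (say, gcc -Wall -o alarm_demo alarm_demo.c; the file name is assumed here) and run ./alarm_demo, the program blocks in pause() for about 10 seconds until the kernel delivers SIGALRM, the handler runs, and control returns to main():

Got alarm signal 14
back to main

(14 is the usual SIGALRM number on Linux; the exact value can vary between platforms.)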
Callback functions can also be used to create a library that will be called from an upper-layer program, and in turn, the library will call user-defined code on the occurrence of some event. The following source code (insertion_main.c, insertion_sort.c and insertion_sort.h) shows this mechanism used to implement a trivial insertion sort library. The flexibility lets users call any comparison function they want.
/* insertion_sort.h */
typedef int (*callback)(int, int);
void insertion_sort(int *array, int n, callback comparison);
/* insertion_main.c */
#include<stdio.h>
#include<stdlib.h>
#include"insertion_sort.h"

/* each comparator returns nonzero when its first argument
   should be placed after the second in the sorted order */
int ascending(int a, int b)
{
    return a > b;
}

int descending(int a, int b)
{
    return a < b;
}

int even_first(int a, int b)
{
    /* one possible definition: a sorts after b
       when a is odd and b is even */
    return (a % 2 != 0) && (b % 2 == 0);
}

int odd_first(int a, int b)
{
    /* one possible definition: a sorts after b
       when a is even and b is odd */
    return (a % 2 == 0) && (b % 2 != 0);
}

int main(void)
{
    int i;
    int choice;
    int array[10] = {22,66,55,11,99,33,44,77,88,0};

    printf("ascending 1: descending 2: even_first 3: odd_first 4: quit 5\n");
    printf("enter your choice = ");
    scanf("%d", &choice);

    switch (choice)
    {
    case 1:
        insertion_sort(array, 10, ascending);
        break;
    case 2:
        insertion_sort(array, 10, descending);
        break;
    case 3:
        insertion_sort(array, 10, even_first);
        break;
    case 4:
        insertion_sort(array, 10, odd_first);
        break;
    case 5:
        exit(0);
    default:
        printf("no such option\n");
    }

    printf("after insertion_sort\n");
    for (i = 0; i < 10; i++)
        printf("%d\t", array[i]);
    printf("\n");

    return 0;
}
/* insertion_sort.c */
#include"insertion_sort.h"

void insertion_sort(int *array, int n, callback comparison)
{
    int i, j, key;

    for (j = 1; j <= n-1; j++)
    {
        key = array[j];
        i = j - 1;
        while (i >= 0 && comparison(array[i], key))
        {
            array[i+1] = array[i];
            i = i - 1;
        }
        array[i+1] = key;
    }
}
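To try it out, the three files can be compiled together (build command assumed, e.g. gcc -Wall -o insertion insertion_main.c insertion_sort.c). Choosing option 1 sorts the sample array in ascending order:

after insertion_sort
0 11 22 33 44 55 66 77 88 99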
Monday, January 19, 2015
Memory optimization for embedded system products
Optimization is important to embedded software developers because they
are always facing limited resources. So, being able to control the size
and speed trade-off with code is critical. It is less common for thought
to be given to the optimization of data, where there can be a similar
speed-versus-size tension. This article looks at how this conflict comes
about and what the developer can do about it.
A key difference between embedded and desktop system programming is variability: every Windows PC is essentially the same, whereas every embedded system is different. There are a number of implications of this variability: tools need to be more sophisticated and flexible; programmers need to be ready to accommodate the specific requirements of their system; standard programming languages are mostly non-ideal for the job. This last point highlights a key issue: control of optimization.
Optimization is a set of processes and algorithms that enable a compiler to advance from translating code from (say) C into assembly language to translating an algorithm expressed in C into a functionally identical one expressed in assembly. This is a subtle but important difference.
Data/memory optimization
A key aspect of optimization is memory utilization. Typically, a decision has to be made in the trade-off between having fast code or small code - it is rare to have the best of both worlds. This decision also applies to data. The way data is stored into memory affects its access time. With a 32-bit CPU, if everything is aligned with word boundaries, access time is fast; this is termed ‘unpacked data’. Alternatively, if bytes of data are stored as efficiently as possible, it may take more effort to retrieve data and hence the access time is slower; this is ‘packed’ data. So you have a choice much the same as with code: compact data that is slow to access, or some wasted memory but fast access to data.
For example, this structure:
struct
{
short two_byte;
char one_byte;
} my_array[4];
could be mapped into memory in a number of ways. The C language standard gives the compiler complete freedom in this regard. Two possibilities are: packed, where the four 3-byte structures occupy 12 contiguous bytes with no padding, or unpacked, where each structure starts on a word boundary and a padding byte is wasted after each element.
Unpacked could be even more wasteful. The layout just described assumes word (16-bit) alignment; long word (32-bit) alignment would result in 5 bytes being wasted for every 3 bytes of data!
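The padding is easy to observe with sizeof. The following minimal sketch assumes a GCC-style toolchain, where packing is requested with __attribute__((packed)) rather than a packed keyword; the exact sizes depend on the target:

#include <stdio.h>

/* default (possibly unpacked) layout */
struct plain
{
    short two_byte;
    char one_byte;
};

/* GCC-style packed layout; other toolchains use a packed
   keyword or #pragma pack instead */
struct __attribute__((packed)) tight
{
    short two_byte;
    char one_byte;
};

int main(void)
{
    printf("plain : %zu bytes\n", sizeof(struct plain)); /* typically 4 */
    printf("packed: %zu bytes\n", sizeof(struct tight)); /* 3 */
    return 0;
}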
Most embedded compilers have a switch to select what kind of code generation and optimization is required. However, there may be a situation where you decide to have all your data unpacked for speed, but have certain data structures where you would rather save memory by packing. In this case, the language extension keyword packed may be applied, thus:
packed struct
{
short two_byte;
char one_byte;
} my_array[4];
This overrides the optimization setting for this one object.
Alternatively, you may need to pack all the data to save memory, and have certain items that you want unpacked either for speed or for sharing with other software. This is where the unpacked extension keyword applies.
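Mirroring the packed example above, on a toolchain that supports it the unpacked keyword would be applied the same way; this is only a sketch, since the keyword is a vendor extension and not standard C:

/* force natural alignment for one shared structure even
   when the rest of the data is compiled packed */
unpacked struct
{
    short two_byte;
    char one_byte;
} shared_buffer[4];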
It is unlikely that you would use both packed and unpacked keywords in one program, as only one of the two code generation options can be active at any one time.
Other data optimizations
Space optimization. As previously discussed, modern embedded compilers provide the opportunity to minimize the space used by data objects; this may be controlled quite well by the developer. However, this optimization is only to the level of bytes, which might not be good enough.
For example, imagine an application that uses a large table of values, each of which is in the range 0 to 15. Clearly this requires 4 bits of storage (a nibble), so keeping them in bytes would only be 50% efficient. It is the developer’s job to do better (if memory footprint is deemed to be of greater importance than access time). There are broadly two ways to address this problem.
One way is to use bit fields in structures. This has the advantage that a compiler can readily optimize memory usage, if the target CPU offers a convenient capability. The downside is that bit fields within a structure cannot be indexed without writing additional code, but this is not too difficult. The following code shows how to access nibbles in an array of structures:
struct nibbles
{
unsigned n0 : 4;
unsigned n1 : 4;
unsigned n2 : 4;
unsigned n3 : 4;
} mydata[100];
unsigned get_nibble(struct nibbles words[], unsigned index)
{
unsigned nibble;
nibble = index % 4;
index /= 4;
switch (nibble)
{
case 0:
return words[index].n0;
case 1:
return words[index].n1;
case 2:
return words[index].n2;
case 3:
return words[index].n3;
}
/* not reached, since nibble is always 0-3; this return
   just avoids a missing-return warning */
return 0;
}
A similar put_nibble() function would be required, of course.
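For completeness, here is what a put_nibble() might look like; a minimal sketch, assuming the same struct nibbles layout as above:

/* stores a 4-bit value (0-15) at the given nibble index */
void put_nibble(struct nibbles words[], unsigned index, unsigned value)
{
    unsigned nibble;

    nibble = index % 4;
    index /= 4;
    value &= 0x0F; /* keep only the low 4 bits */
    switch (nibble)
    {
    case 0:
        words[index].n0 = value;
        break;
    case 1:
        words[index].n1 = value;
        break;
    case 2:
        words[index].n2 = value;
        break;
    case 3:
        words[index].n3 = value;
        break;
    }
}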
The other way to code a solution would be to perform all the bit shifting explicitly in the code, which is really just emulating what the compiler might generate. It is unlikely that a human programmer could produce code substantially more efficient than a modern compiler.
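For comparison, a hand-written shift-and-mask version might look like the following sketch; the names get_nibble2/put_nibble2 and the raw byte array are invented for illustration:

/* two nibbles per byte in a plain byte array:
   50 bytes hold 100 nibbles */
unsigned char table[50];

unsigned get_nibble2(const unsigned char *t, unsigned index)
{
    unsigned shift = (index % 2) * 4; /* low or high nibble */
    return (t[index / 2] >> shift) & 0x0F;
}

void put_nibble2(unsigned char *t, unsigned index, unsigned value)
{
    unsigned shift = (index % 2) * 4;
    t[index / 2] = (unsigned char)((t[index / 2] & ~(0x0F << shift))
                                   | ((value & 0x0F) << shift));
}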
Speed optimization. There is little a developer can do to improve speed of access to data beyond the optimization that the compiler does (i.e., not packing the data for fast access). But one option is to locate data in the fastest available memory. An embedded toolchain includes a linker, which will normally have the flexibility to effect this optimization. This opens up a few possibilities for consideration:
The fastest place to keep data is in a CPU register, but these are in short supply and should be used sparingly. Most compilers make smart choices for register optimization.
RAM is the fastest type of memory in most systems. Obviously, variables tend to be located in RAM, but it may be worthwhile to ensure that constant data is copied into RAM as well. This is commonly done automatically, as code is normally copied from flash to RAM for execution.
Microcontrollers typically have on-chip RAM, which is faster than external memory. So ensuring that speed-critical data is located there makes sense (see the sketch after this list).
Memory is commonly cached into an internal buffer for fast access. Some CPUs permit locking of a cache so that the contents are always immediately available.
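As one concrete illustration of the on-chip RAM point above: with a GCC-style toolchain, a variable can be tagged with a named section, and the linker script decides where that section lives. This is a sketch; the section name .fast_ram is hypothetical and must be defined in the linker script:

/* place a speed-critical buffer in a section that the linker
   script maps to internal RAM; section name is an assumption */
__attribute__((section(".fast_ram")))
int filter_coefficients[64];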