DIGITAL SIGNAL PROCESSING LABORATORY
AN INTRODUCTION TO TMS320C31
I. Introduction
Texas Instruments’ TMS320C31-based DSP Starter Kit (DSK) includes a board with the TMS320C31 floating-point processor and input and output (I/O) support. The DSK board contains an analog interface circuit (AIC) that provides programmable ADC and DAC rates as well as input and output filtering, all on a single chip.
The TMS320C31, a general-purpose digital signal processor, is a member of the TMS320C3x third-generation family of floating-point processors. It is a true 32-bit processor capable of performing floating-point, integer, and logical operations. With a 40-ns instruction cycle time, the TMS320C31 can execute up to 50 million floating-point operations per second (MFLOPS) and 25 million instructions per second (MIPS). Its performance is further enhanced by 2K words of internal (on-chip) memory, a 24-bit address bus, and a direct memory access (DMA) controller. Furthermore, as a floating-point processor, the TMS320C31 has eight 40-bit extended-precision registers, R0-R7, that help avoid overflow during accumulation, whereas data in a fixed-point processor must be scaled down to avoid overflow.
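The relation between the 40-ns cycle time and the quoted rates can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes that the MFLOPS figure counts one multiply and one add completing in the same cycle.

```python
# Worked arithmetic for the figures quoted above (40-ns instruction cycle).
CYCLE_TIME_NS = 40  # from the text


def instructions_per_second(cycle_time_ns):
    """One instruction completes per cycle, so the rate is 1 / cycle time."""
    return 1e9 / cycle_time_ns


mips = instructions_per_second(CYCLE_TIME_NS) / 1e6

# Assumption: a parallel multiply and add count as two floating-point
# operations per cycle, which is how the MFLOPS figure doubles the MIPS figure.
FLOPS_PER_CYCLE = 2
mflops = mips * FLOPS_PER_CYCLE

print(mips, mflops)
```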
II. Architecture
1. An Overview
1.1. Register Based Central Processing Unit (CPU)
· Floating-point/integer multiplier to perform single cycle multiplication on 24-bit integers and 32-bit floating point values.
· Arithmetic Logic Unit to perform floating-point, integer, and logical operations.
· 32-bit barrel shifter.
· Internal buses (CPU1/CPU2 and REG1/REG2).
· Auxiliary Register Arithmetic Units (ARAU0 and ARAU1).
· CPU registers, which are discussed in Section 2 below.
1.2. Memory Organization
· 64 x 32 bit instruction cache.
· Two 1K x 32-bit RAM blocks, each capable of supporting two CPU accesses in a single cycle.
· Program, data, and DMA buses.
1.3. Peripherals
· One serial port.
· Two general purpose 32-bit timers.
1.4. Direct Memory Access
· The on-chip DMA controller is able to read and write to any location in the memory map without interfering with the operation of the CPU.
2. CPU Registers
2.1. The TMS320C3x has 28 registers in a multiport register file that is tightly coupled to the CPU. All of these registers can be used as general-purpose registers and can be operated upon by the multiplier and the ALU. However, the registers also have special functions, as listed below.
2.2. CPU Registers and Their Functions
i. R0-R7, eight 40-bit extended precision registers that are capable of storing and supporting operations on 32-bit integer and 40-bit floating point numbers.
ii. AR0-AR7, eight 32-bit auxiliary registers that can be accessed by the CPU and modified by the two ARAUs. These general purpose auxiliary registers are commonly used for indirect memory addressing.
iii. DP, 32-bit data page pointer that is used by the direct addressing mode as a pointer to one of the 256 data pages with each page containing 64K words.
iv. IR0 and IR1, 32-bit index registers for addressing.
v. BK, 32-bit block size register used by the ARAU to specify the block size of a circular buffer.
vi. SP, 32-bit system stack pointer that contains the address of the top of the stack.
vii. ST, status register for the CPU.
viii. IE and IF, CPU/DMA Interrupt Enable register and CPU Interrupt Flag register. Both are 32-bit registers in which a 1 sets the corresponding interrupt, while a 0 disables it.
ix. IOF, I/O flags register that is used to control the function of the dedicated external pins, XF0 and XF1.
x. RC, 32-bit repeat counter used to specify the number of times a block of code is to be executed.
xi. RS and RE, repeat start address register and repeat end address register. These registers, respectively, contain the starting and ending addresses of a block of code to be executed.
III. Types of Addressing
1. Types of Addressing and Addressing Mode
1.1. The TMS320C31 allows six types of addressing to access data from memory, registers, and the instruction word. They are:
· Register
· Direct
· Indirect
· Short-immediate
· Long-immediate
· PC-relative
1.2. For application purposes, the TMS320C31 contains five groups of addressing modes. Each group can use only some of the above types of addressing.
i. General Addressing Modes (G)
· Register
· Direct
· Indirect
· Short-Immediate
ii. Three Operand Addressing Modes (T)
· Register
· Indirect
iii. Parallel Addressing Modes (P)
· Register
· Indirect
iv. Conditional-Branch Addressing Modes (B)
· Register
· PC-relative
v. Long-Immediate Addressing Mode
· Long-immediate
2. Register Addressing
This type of addressing is used to access or address the CPU registers. The general syntax is:
Instruction or Assembler Directive    Operand/CPU Register’s Name
For example,
FIX R0,R1
converts a floating-point value in R0 to an equivalent integer value in R1.
3. Direct Addressing
In this addressing, the data address is formed by concatenating the 8 least significant bits of the data page pointer (DP) with the 16 least significant bits of the instruction word. The symbol @ is used to indicate direct addressing.
For example,
ADDI @0x809805,R0
Before instruction: DP=80h, data at 809805h=12345678h, R0=0h
After instruction: DP=80h, data at 809805h=12345678h, R0=12345678h
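The address formation described above can be sketched in Python. This is a simplified model rather than a description of the hardware; the function name direct_address and the dictionary standing in for memory are hypothetical, and the values are taken from the ADDI example.

```python
def direct_address(dp, instruction_word):
    """Concatenate the 8 LSBs of DP with the 16 LSBs of the instruction word."""
    page = dp & 0xFF                      # 8 least significant bits of DP
    offset = instruction_word & 0xFFFF    # 16 least significant bits of the word
    return (page << 16) | offset


# Values from the example: DP = 80h, operand @0x809805 (low 16 bits are 9805h).
memory = {0x809805: 0x12345678}  # hypothetical stand-in for data memory
dp = 0x80
r0 = 0x0

address = direct_address(dp, 0x9805)
r0 = (r0 + memory[address]) & 0xFFFFFFFF  # ADDI: integer add into R0

print(hex(address), hex(r0))
```

With 8 page bits and 16 offset bits, the scheme covers 256 pages of 64K words each, which matches the 24-bit address bus.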
4. Indirect Addressing
Indirect addressing uses the auxiliary registers (AR0-AR7), the index registers, and an optional displacement to specify the address of an operand in memory. Indirect addressing through an auxiliary register is represented with the * symbol (*ARn).
For example,
MPYF3 *AR0++,*AR1++,R1
Before instruction: AR0 points to address 08h, data pointed to by AR0=06h; AR1 points to address 09h, data pointed to by AR1=04h; R1=0
After instruction: AR0 points to address 09h, data pointed to by AR0=04h; AR1 points to address 0Ah, data pointed to by AR1=01h; R1=18h
The above instruction multiplies the data pointed to by AR0 by the data pointed to by AR1 and stores the result in R1. Then both AR0 and AR1 are incremented by one, so that they point to 09h and 0Ah, respectively.
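The post-increment behavior in this example can be modeled with a short Python sketch. It is a simplified illustration, not a description of the processor itself; the function name and the dictionaries that stand in for memory and registers are hypothetical, and the data values are those of the example.

```python
# Hypothetical stand-ins for data memory and the CPU registers.
memory = {0x08: 6.0, 0x09: 4.0, 0x0A: 1.0}
regs = {"AR0": 0x08, "AR1": 0x09, "R1": 0.0}


def mpyf3_post_increment(regs, memory, src_a, src_b, dst):
    """Model MPYF3 *src_a++,*src_b++,dst: fetch, multiply, then increment."""
    a = memory[regs[src_a]]   # operand fetched from the address held in src_a
    b = memory[regs[src_b]]   # operand fetched from the address held in src_b
    regs[dst] = a * b         # product goes to the destination register
    regs[src_a] += 1          # post-increment: pointers move after the fetch
    regs[src_b] += 1


mpyf3_post_increment(regs, memory, "AR0", "AR1", "R1")
print(hex(regs["AR0"]), hex(regs["AR1"]), regs["R1"])
```

The product 6 x 4 = 24 corresponds to the 18h shown for R1 in the example.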
Another example,
*AR0++(4)%    ;AR0=0 (0th value)
*AR0++(5)%    ;AR0=4 (1st value)
*AR0--(3)%    ;AR0=3 (2nd value)
The % sign indicates circular addressing, in which a circular buffer that is invisible to the user is utilized. In this example, let the initial AR0=0 and BK=0110 in binary (a block size of 6). The algorithm for circular addressing is as follows:
if 0 <= index + step < BK
    index = index + step
else if index + step >= BK
    index = index + step - BK
else if index + step < 0
    index = index + step + BK
Let us assume the size of the circular buffer is 6 and there is an array of 6 elements with values starting from 0 at the 0th element and increasing to 5 at the last element. Initially AR0 points to the 0th element; thus, AR0=0. After *AR0++(4)%, AR0=4 (0+4=4<6). Then, after *AR0++(5)%, AR0=3 (4+5=9>=6; thus AR0=9-6=3). Finally, after *AR0--(3)%, AR0=0 (3-3=0).
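The index update for circular addressing can be traced with a small Python sketch. This is a simplified model under stated assumptions: the buffer starts at index 0, the step magnitude does not exceed the block size, and the function name circular_update is hypothetical.

```python
def circular_update(index, step, bk):
    """Return the next index in a circular buffer of size bk.

    Assumes 0 <= index < bk and abs(step) <= bk, so one wrap is enough.
    """
    new_index = index + step
    if new_index >= bk:      # ran past the end: wrap around to the start
        new_index -= bk
    elif new_index < 0:      # ran before the start: wrap around to the end
        new_index += bk
    return new_index


bk = 6        # block size from the example
index = 0     # initial AR0
trace = [index]
for step in (4, 5, -3):   # *AR0++(4)%, *AR0++(5)%, *AR0--(3)%
    index = circular_update(index, step, bk)
    trace.append(index)

print(trace)  # [0, 4, 3, 0]
```

Because the modification is applied after the operand is fetched, each instruction uses the index that precedes its own step in the trace.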
5. Short-Immediate Addressing
In this addressing, the operand is a 16-bit immediate value contained in the 16 least significant bits of the instruction word. The operand can be a 2’s complement integer, an unsigned integer, or a floating-point number.
For example,
ADDI 0Ah,R0
Before instruction: R0=0h
After instruction: R0=0Ah
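When the 16-bit immediate is treated as a 2's complement integer, it has to be sign-extended before it is added to a 32-bit register. The Python sketch below illustrates that step only; the function name sign_extend_16 is hypothetical, and the model ignores status flags.

```python
def sign_extend_16(value):
    """Interpret the low 16 bits of value as a 2's complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# ADDI 0Ah,R0 with R0 initially 0, keeping the result to 32 bits.
r0 = 0x0
r0 = (r0 + sign_extend_16(0x000A)) & 0xFFFFFFFF
print(hex(r0))

# A negative immediate: FFFFh is -1, so adding it to 0Ah gives 9.
r0_minus_one = (r0 + sign_extend_16(0xFFFF)) & 0xFFFFFFFF
print(hex(r0_minus_one))
```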
6. Long-Immediate Addressing
In this addressing, the operand is a 24-bit immediate value contained in the 24 least significant bits of the instruction word.
For example,
BR 7B00h
Before instruction: PC=0h
After instruction: PC=7B00h
7. PC-Relative Addressing
PC-relative addressing is usually used for branching.
For example,
CALL FILTER
Before instruction: PC=1008h, FILTER address is 2017h
After instruction: PC=2017h, FILTER address is 2017h
IV. More Application Examples
1. General Instruction Syntax Format
Label    Instruction or Assembler Directive    Operand    ;Comment
For example,
LOOP    SUBI 1,R0    ;subtract 1 from R0
Note: Appendix A in “Digital Signal Processing- Laboratory Experiments Using C and the TMS320C31 DSK” contains a summary of the C3x instruction set.
2. Math Instructions (Addition, Subtraction, and Multiplication)
ADDF3/SUBF3 R0,R2,R1
The above instruction adds the floating-point values in registers R0 and R2 (or, for SUBF3, subtracts R0 from R2) and stores the result in R1.
MPYF3 *AR0++,*AR1++,R0
This instruction accesses the values in the memory locations pointed to by AR0 and AR1 through indirect addressing and multiplies them, storing the result in R0.
3. Load and Store Instructions
3.1. Consider the two instruction lines below:
LDI @DATA_1,AR0
STF R1,*AR1++
· The first instruction line directly loads the address represented by the label DATA_1 into the auxiliary register AR0.
· The second instruction line stores the floating-point value in R1 into the memory location pointed to by AR1. Then AR1 is post-incremented to the next higher memory address (a displacement of one by default).
3.2. Consider now the two instruction lines below:
LDI @IN_ADDR,AR4
FLOAT *AR4,R1
· The first instruction line directly loads an address represented by the label IN_ADDR into AR4.
· The second instruction line then converts the content of the address specified by AR4 to a floating-point value and stores it in R1.
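4. Repeat and Parallel Instructions
A block of instructions can be repeated a number of times using the repeat block instruction RPTB. A single instruction, including a parallel pair, can be repeated using the repeat single instruction RPTS, as in the following example.
LDI 10,AR2
RPTS AR2
MPYF3 *AR0++,*AR1++,R0
|| ADDF3 R0,R2,R2
ADDF R0,R2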
First, the register AR2 is set to 10. Then, the MPYF3 and ADDF3 instructions are executed 11 times (repeated 10 times) in parallel. Note that the second addition instruction ADDF R0,R2 is executed only once.
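The reason for the extra ADDF R0,R2 can be seen by modeling the loop in Python. The sketch below is illustrative only and rests on assumptions that the text does not state: R0 and R2 start at zero, and in a parallel pair both instructions read their operands before either result is written, so each addition picks up the product from the previous pass. The function name and the sample data are hypothetical.

```python
def rpts_dot_product(a, b, repeat_count):
    """Model RPTS count / MPYF3 || ADDF3 / ADDF as a multiply-accumulate loop."""
    r0 = 0.0   # assumption: R0 cleared before the loop
    r2 = 0.0   # assumption: R2 cleared before the loop
    ar0 = 0    # index into a (stands in for the address in AR0)
    ar1 = 0    # index into b (stands in for the address in AR1)
    passes = 0
    for _ in range(repeat_count + 1):   # executed count + 1 times
        # Parallel pair: both operations read the old register values.
        product = a[ar0] * b[ar1]       # MPYF3 *AR0++,*AR1++,R0
        running_sum = r0 + r2           # ADDF3 R0,R2,R2 (uses the old R0)
        r0 = product
        r2 = running_sum
        ar0 += 1
        ar1 += 1
        passes += 1
    r2 = r0 + r2                        # final ADDF R0,R2 adds the last product
    return r2, passes


a = [float(i) for i in range(1, 12)]   # hypothetical data: 1.0 to 11.0
b = [2.0] * 11
result, passes = rpts_dot_product(a, b, 10)
print(result, passes)
```

Without the final addition, the sum in R2 would be missing the product computed on the last pass.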
V. Assembler Directives
1. Assembler directives begin with a period, for example .set, .end, .start, and .text. An assembler directive is a message for the assembler and not an instruction. It is resolved during the assembly process and, unlike an instruction, does not occupy memory space.
2. For example,
.include "prog1.asm"       ;include the program prog1.asm
.start "text",0x809900     ;beginning address for the text section
.start "data",0x809C00     ;beginning address for the data section
A .set 40                  ;set A=40
VI. The DSK Software Tools and Required Exercises
1. To assemble an assembly program, type:
dsk3a filename.asm
For example:
dsk3a matrix.asm
The corresponding executable file matrix.dsk will be created by the assembler.
2. To load an executable file into the DSK debugger and run it, type:
dsk3d
reset
load filename.dsk
run
For example:
dsk3d
reset
load matrix.dsk
run
3. To load and run an executable file using the boot loader, type:
dskload filename.dsk
For example:
dsk3a sine4p.asm
dskload sine4p.dsk
This procedure neither invokes the debugger nor resets the C31 processor, so erroneous values can result. In that case, use the debugger to reset the C31 processor.
Note.
1. The filename’s extension (.asm or .dsk) is not necessary.
2. These commands are not case-sensitive.
3. The C31 processor has to be reset every time before running a program.
4. To check the memory, type:
mem addr
For example:
memd 0x809c00 or memf 0x809c00 or memx 0x809c00
checks the contents of memory starting at address 809c00 and displays the result in 32-bit decimal format, floating-point decimal format, or 32-bit hex format, respectively. The hexadecimal prefix 0x is required in debugger commands.
5. To enter the DOS shell, type:
dos
To return to the debugger, type:
exit
6. To exit the debugger, type:
quit
7. Introduction to the DSK debugger
1. Access sub-windows
There are four sub-windows on the DSK debugger screen. The program code is shown in the DISASSEMBLY window. The CPU REGISTERS window shows the value contained in every CPU register. The MEMORY window displays the contents of memory. All commands should be entered in the COMMAND window, which is active when the DSK debugger is successfully invoked. Access the other three windows using Alt-D, Alt-C, and Alt-M, respectively. Return to the COMMAND window by pressing Esc.
2. Some commonly used function keys
Within the COMMAND window, press F5 to run the program, press F8 to single-step through the program, and press F2 or F3 to display the CPU registers in 32-bit hex format or float format, respectively.
Within the DISASSEMBLY window, press F2 to set a breakpoint and press F4 to run until the breakpoint.