EP93xx ARM9 Embedded Processor Family User's Guide. Notes on making a proper EABI cross compiler for Maverick Crunch (EP, EP93xx) processors. This is a bit of "higher order hacking". It's already configured to build in the /opt/toolchains/ directory. This work is based on patches by Martin Guy and tested both on Cirrus demo board for the EP
|Published (Last):||5 February 2009|
Mainline GCC support has never worked for it, but a modified compiler is available that does work and that can generate Crunch-accelerated Debian packages. Discussion specific to it usually happens on the linux-cirrus mailing list. The compilers can be downloaded under http: It has a different instruction set from the other floating point accelerators found with ARM processors. Five revisions of the silicon were issued: D0, D1, E0, E1 and E2.
The revision of a chip is printed as the 5th and 6th characters of the second line of text on the chip housing. The now rare D0 revision has a more extensive range of hardware bugs than the later revisions; from D1 to E2 no further modifications were made to the design of the Maverick unit.
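As a small illustration of the rule above, the sketch below reads the revision out of the marking text. The function name and the sample marking are made up for the example; only the "5th and 6th characters of the second line" rule and the list of revisions come from the text.

```python
# The five revisions named in the text.
KNOWN_REVISIONS = {"D0", "D1", "E0", "E1", "E2"}

def silicon_revision(marking_lines):
    """Return the revision printed as the 5th and 6th characters of line 2."""
    second_line = marking_lines[1]
    revision = second_line[4:6]  # 1-based characters 5 and 6
    if revision not in KNOWN_REVISIONS:
        raise ValueError(f"unrecognised revision {revision!r}")
    return revision

# Hypothetical marking text, invented only to exercise the function.
example_marking = ["EP93XX-EXAMPLE", "ABCDE1 0000"]
print(silicon_revision(example_marking))
```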
Here we only attempt to work around the bugs in the later series. Cirrus stopped development of its ARM devices on 1st April (no joke!).

Registers

It has 16 64-bit registers, which can be treated as single- or double-precision floating point values, or as 32- or 64-bit integers. Single-precision floats live in the top 32 bits of the register and, when they are written, the lower 32 bits are zeroed. It also has four 72-bit multiply-accumulate integer registers which are not used by GCC.
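The placement of single-precision values can be pictured with a short sketch. It models a register as a plain 64-bit integer; the helper names are hypothetical and this only illustrates the layout described above, it does not touch real hardware.

```python
import struct

def single_to_register(value):
    """Model writing a single-precision float: top 32 bits hold it, low 32 are zero."""
    bits32 = struct.unpack(">I", struct.pack(">f", value))[0]
    return bits32 << 32

def register_to_single(register):
    """Model reading a single-precision float back from the top 32 bits."""
    bits32 = (register >> 32) & 0xFFFFFFFF
    return struct.unpack(">f", struct.pack(">I", bits32))[0]

reg = single_to_register(1.5)
print(f"{reg:#018x}")           # 0x3fc0000000000000
print(register_to_single(reg))  # 1.5
```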
Instruction set

It provides instructions to add, subtract, multiply, compare, negate and give absolute value for all these types, to shift the registers in the two integer modes, and to convert between the data types. These operations can only be done between Maverick registers, but data can be copied between Maverick and ARM registers and between Maverick registers and main memory.

Operating modes

The FPU can operate in several modes, controlled by bits in its status register: Deselects saturating arithmetic for integer operations and selects the usual C-like overflowing.
The default is saturating, which is wrong for C. The default is signed. Synchronous mode is much slower, but ensures that, if floating point exceptions are enabled and occur, you can be sure to pinpoint the offending instruction.
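To make the difference between the saturating default and C-like overflowing concrete, here is a minimal sketch for signed 32-bit addition. It is a model in plain Python, not Maverick code, and it assumes "C-like overflowing" means two's-complement wraparound.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def add32_saturating(a, b):
    """Clamp the sum to the signed 32-bit range (the hardware default)."""
    return max(INT32_MIN, min(INT32_MAX, a + b))

def add32_wrapping(a, b):
    """Wrap the sum around in two's complement, as C code usually expects."""
    total = (a + b) & 0xFFFFFFFF
    return total - 2**32 if total >= 2**31 else total

print(add32_saturating(INT32_MAX, 1))  # 2147483647
print(add32_wrapping(INT32_MAX, 1))    # -2147483648
```

A compiler targeting C therefore has to make sure the saturating mode is switched off before it uses the integer instructions.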
Exception handling is asynchronous by default. Forwarding channels the results of arithmetic operations back to the input of the logic unit as well as to the destination registers so that, when the result of one instruction is used in another soon after, execution is faster. The default is non-forwarding.

Instruction format

MaverickCrunch instructions are 32-bit words that are interleaved with the regular ARM instruction stream.
It appears as co-processors 4, 5 and 6, and its instruction words in hexadecimal match the regular expression 0x. In GCC output, this is further restricted to 0xe[cde]

Most crucially, mainline GCC fails to take proper account of the way that the FPU sets the condition code registers after a comparison, so the code it generates sometimes gets floating point and 64-bit integer comparisons wrong. It also fails to account for several of the hardware bugs.
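A rough way to spot Maverick instruction words in a stream is sketched below. It relies on the general ARM coprocessor instruction layout (bits 27..24 equal to 0xC, 0xD or 0xE, coprocessor number in bits 11..8), which is my assumption here rather than something spelled out above, and the sample words are invented for the test, not real disassembly.

```python
MAVERICK_COPROCESSORS = {4, 5, 6}

def is_maverick_word(word):
    """Return True if a 32-bit ARM instruction word targets coprocessor 4, 5 or 6."""
    opcode_class = (word >> 24) & 0xF  # bits 27..24: 0xC/0xD load-store, 0xE data/transfer
    cp_num = (word >> 8) & 0xF         # bits 11..8: coprocessor number
    return opcode_class in (0xC, 0xD, 0xE) and cp_num in MAVERICK_COPROCESSORS

print(is_maverick_word(0xEE000400))  # True: coprocessor class, coprocessor 4
print(is_maverick_word(0xE1A00000))  # False: ordinary ARM data-processing word
print(is_maverick_word(0xEE000A00))  # False: coprocessor 10, not Maverick
```

The top nibble (bits 31..28) is the ARM condition field, so this check ignores it.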
GCC does not use: It performs these in ARM registers as usual. It has a -mcirrus-fix-invalid-insns flag, which is supposed to ensure that the two instructions following a branch are not Cirrus ones (but fails to do so), and that every cfldrd, cfldr64, cfstrd or cfstr64 is followed by one non-Cirrus instruction, which should fix bugs 1 and 2.
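The two rules the flag is meant to enforce can be checked mechanically. The sketch below does so over a list of mnemonics; it is not GCC's implementation, and the `BRANCHES` set and the "starts with cf" test are simplifying assumptions made for the example.

```python
TWO_WORD_LOAD_STORE = {"cfldrd", "cfldr64", "cfstrd", "cfstr64"}
BRANCHES = {"b", "bl", "beq", "bne"}  # assumption: a small sample, not the full set

def is_cirrus(mnemonic):
    """Treat every mnemonic starting with 'cf' as a Cirrus (Maverick) instruction."""
    return mnemonic.startswith("cf")

def find_violations(mnemonics):
    """Return (index, reason) pairs for instructions that break either rule."""
    violations = []
    for i, mnemonic in enumerate(mnemonics):
        following = mnemonics[i + 1:i + 3]
        if mnemonic in BRANCHES:
            # Rule 1: neither of the two instructions after a branch may be Cirrus.
            for offset, nxt in enumerate(following, start=1):
                if is_cirrus(nxt):
                    violations.append((i + offset, "Cirrus instruction after branch"))
        if mnemonic in TWO_WORD_LOAD_STORE and following and is_cirrus(following[0]):
            # Rule 2: a two-word load/store must be followed by a non-Cirrus instruction.
            violations.append((i + 1, "Cirrus instruction after two-word load/store"))
    return violations

print(find_violations(["bne", "nop", "cfadd32", "cfldrd", "cfadd32"]))
```

A two-word load or store at the very end of the list is not flagged, because the sketch cannot see what follows it.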
There are three versions of it, all based on gcc Some real-life programs compiled with it do seem to work though. The modifications are published as a megabyte tarball from which a single monolithic patch can be derived by diffing it against the mainline source releases.

Futaris patches

futaris patches for gcc Futaris' strategy includes disabling all conditional instructions other than branch and all 64-bit integer operations.
Here is how to build a futaris-patched compiler, a summary of their merits, and some benchmarks. It disables all 64-bit integer operations, which appear to have more unidentified hardware bugs, as shown by the openssl testsuite. The -mcirrus-di flag enables them, caveat emptor. There is a long description of it at http:

Summary of bugs CMP: The bugs The bugs are: The unpublished futaris patches for 4. This thread on the binutils mailing list explains why unwind support is needed.
As you can see in Sec 9. The above patch incorrectly calls the iWMMXt pop functions. A new Pop MV registers instruction needs to be added to the table, along with changes to Sec 7. At the moment, only one branch of libunwind, in git, supports ARM processors. Myers says on linux-cirrus (31 Mar): That illustrates the sort of thing that needs changing to implement unwind support for a new coprocessor.
Obviously you need to get the unwind specification in the official ARM EABI documents first before implementing it in GCC, and binutils will also need to support generating correct information given.

Hardware bugs

See cirrus. The following is from the EP rev E2 errata: An instruction appears in the coprocessor pipeline, but does not execute for one of the following reasons: It fails its condition code check.
A branch is taken and it is one of the two instructions in the branch delay slot. It is, if and only if both: In the sample I have tested (a TS) it is not operating in serialised mode by these criteria because no exceptions are enabled. These include all of the following: An instruction may be non-executed because it is conditional and the condition is false.
GCC does not emit conditional Maverick instructions, and the branch case would be covered by mainline's -mcirrus-fix-invalid-insns flag if that code were not broken. Futaris and Cirrus remove this flag. A test program tickles the bug in both ways on revision E1 silicon. Let the second instruction be an instruction with the same target that is not executed. Execute a third instruction at least one of whose operands is the target of the previous two instructions.
For example, assume no pipeline interlocks other than the dependencies involving register c0 in the following instruction sequence:

Buggy  cfadd - cfaddne - cfstr
Buggy  cfadd - nop - cfaddne - cfstr
Buggy  cfadd - cfaddne - nop - cfstr
OK     cfadd - nop - nop - cfaddne - cfstr
Buggy  cfadd - nop - cfaddne - nop - cfstr
Buggy  cfadd - cfaddne - nop - nop - cfstr
OK     cfadd - nop - nop - nop - cfaddne - cfstr
OK     cfadd - nop - nop - cfaddne - nop - cfstr
OK     cfadd - nop - cfaddne - nop - nop - cfstr
OK     cfadd - cfaddne - nop - nop - nop - cfstr
Buggy  cfadd - cfaddne - cfaddne - cfstr
Buggy  cfadd - cfaddne - cfaddne - nop - cfstr
OK     cfadd - cfaddne - cfaddne - nop - nop - cfstr
OK     cfadd - nop - cfaddne - cfaddne - cfstr
OK     cfadd - nop - cfaddne - cfaddne - nop - cfstr
OK     cfadd - nop - cfaddne - cfaddne - nop - nop - cfstr

The second instruction may also not be executed because it follows a branch. GCC doesn't emit conditional Maverick instructions, and the jump case should be fixed by mainline's -mcirrus-fix-invalid-insns.
Let the first instruction be a serialized instruction that does not execute. For an instruction to be serialized, at least one of the following must be true: The processor must be operating in serialized mode.
Let the immediately following instruction be a two-word coprocessor load or store. In the case of a load, only the lower 32 bits (the first word) will be loaded into the target register. Even if there are serialized instructions out there, GCC does not emit conditional Maverick instructions, which just leaves the case of a Maverick instruction being in one of the two slots after a branch that is taken, which is covered by -mcirrus-fix-invalid-insns.

When the coprocessor is not in serialized mode and forwarding is enabled, memory can be corrupted when two types of instructions appear in the instruction stream with a particular relative timing.
Execute an instruction that is a data operation (not a move between ARM and coprocessor registers) whose destination is one of the general purpose registers c0 through c15. Execute an instruction that is a two-word coprocessor store (either cfstr64 or cfstrd), where the destination register of the first instruction is the source of the store instruction; that is, the second instruction stores the result of the first one to memory.
Finally, the first and second instruction must appear to the coprocessor with the correct relative timing; this timing is not simply proportional to the number of intervening instructions and is difficult to predict in general. The result is that the lower 32 bits of the result stored to memory will be correct, but the upper 32 bits will be wrong. The value stored in the target register will still be correct. Code to enable forwarding under Linux (with Maverick support enabled in the kernel, the effect is limited to the process that does this): Under Linux on the sample board I use, forwarding is disabled by default.
Enabling forwarding in a test program on revision E1 hardware, I have been unable to get this bug to bite.

The instructions shift by an unpredictable amount, but cause no other side effects.
Disable interrupts when executing cfldr32 or cfmv64lr instructions.
Avoid executing these two instructions. Do not depend on the sign extension to occur; that is, ignore the upper word in any calculations involving data loaded using these instructions. Add extra code to sign extend the lower word after it is loaded by explicitly forcing the upper word to be all zeroes or all ones, as appropriate.
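The last workaround, forcing the upper word to all zeroes or all ones, amounts to the following. This is a sketch on a 64-bit integer standing in for a register, with a hypothetical function name; real code would do the same thing with Maverick or ARM instructions.

```python
def sign_extend_low_word(register):
    """Rebuild a 64-bit value from its low 32 bits, ignoring the upper word."""
    low = register & 0xFFFFFFFF
    if low & 0x80000000:
        # Negative 32-bit value: force the upper word to all ones.
        return 0xFFFFFFFF00000000 | low
    # Non-negative 32-bit value: force the upper word to all zeroes.
    return low

print(hex(sign_extend_low_word(0xDEADBEEF80000000)))  # 0xffffffff80000000
print(hex(sign_extend_low_word(0x123456787FFFFFFF)))  # 0x7fffffff
```

Whatever the faulty load left in the upper word is discarded, so the result no longer depends on the hardware's sign extension.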
It is possible to do this selectively in exception or interrupt handler code.
If the instruction preceding the interrupted instruction can be determined, and it is a cfldr32 or cfmv64lr, the instruction may be re-executed or explicitly sign extended before returning from interrupt or exception. Mainline GCC does not emit cfldr32, and use of cfmv64lr is disabled as buggy. In three places it is used as the first of a two-instruction sequence:

This error can occur if the following is true: The first instruction must be a coprocessor compare instruction, one of cfcmp32, cfcmp64, cfcmps or cfcmpd.
GCC does not use the accumulators.

This error will occur under the following conditions: The second instruction is not a coprocessor data path instruction. Coprocessor data path instructions include any instruction that does not move data to or from memory or to or from the ARM registers.
The second consecutive instruction: When the error occurs, the result is either coprocessor register or memory corruption.