INTRODUCTION TO THE ARM INSTRUCTION SET

ANDREW N. SLOSS, ... CHRIS WRIGHT, in ARM System Developer's Guide, 2004

3.3.3 MULTIPLE-REGISTER TRANSFER

Load-store multiple instructions can transfer multiple registers between memory and the processor in a single instruction. The transfer occurs from a base address register Rn pointing into memory. Multiple-register transfer instructions are more efficient than single-register transfers for moving blocks of data around memory and for saving and restoring context and stacks.

Load-store multiple instructions can increase interrupt latency. ARM implementations do not usually interrupt instructions while they are executing. For example, on an ARM7 a load multiple instruction takes 2 + Nt cycles, where N is the number of registers to load and t is the number of cycles required for each sequential access to memory. If an interrupt has been raised, then it has no effect until the load-store multiple instruction is complete.
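As a rough worked example of the 2 + Nt figure, the following Python sketch evaluates the cycle count for a few register counts. The function name `ldm_cycles` and the default t = 1 are illustrative assumptions, not values taken from the text.

```python
def ldm_cycles(n_regs: int, t: int = 1) -> int:
    """Cycles for an ARM7 load multiple, using the 2 + N*t formula from the text."""
    return 2 + n_regs * t

# The delay before a pending interrupt can be taken grows linearly with N.
for n in (1, 4, 8, 16):
    print(n, "registers ->", ldm_cycles(n), "cycles")
```

Under these assumptions, capping the register count of each transfer caps the extra latency a single instruction can add.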

Compilers, such as armcc, provide a switch to control the maximum number of registers being transferred on a load-store, which limits the maximum interrupt latency.

LDM | load multiple registers | {Rd}*N <- mem32[start address + 4*N], optional Rn updated
STM | save multiple registers | {Rd}*N -> mem32[start address + 4*N], optional Rn updated

Table 3.9 shows the different addressing modes for the load-store multiple instructions. Here N is the number of registers in the list of registers.

Table 3.9. Addressing mode for load-store multiple instructions.

Addressing mode | Description | Start address | End address | Rn!
IA | increment after | Rn | Rn + 4*N - 4 | Rn + 4*N
IB | increment before | Rn + 4 | Rn + 4*N | Rn + 4*N
DA | decrement after | Rn - 4*N + 4 | Rn | Rn - 4*N
DB | decrement before | Rn - 4*N | Rn - 4 | Rn - 4*N
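The address arithmetic in Table 3.9 can be checked with a short Python sketch. The function name `ldm_stm_addresses` and the sample base address are assumptions chosen for illustration.

```python
def ldm_stm_addresses(mode: str, rn: int, n: int):
    """Return (start, end, rn_after_writeback) for N registers, per Table 3.9."""
    table = {
        "IA": (rn,             rn + 4 * n - 4, rn + 4 * n),
        "IB": (rn + 4,         rn + 4 * n,     rn + 4 * n),
        "DA": (rn - 4 * n + 4, rn,             rn - 4 * n),
        "DB": (rn - 4 * n,     rn - 4,         rn - 4 * n),
    }
    return table[mode]

# Three registers transferred with an (assumed) base address of 0x80010.
for mode in ("IA", "IB", "DA", "DB"):
    start, end, wb = ldm_stm_addresses(mode, 0x80010, 3)
    print(mode, hex(start), hex(end), hex(wb))
```

In every mode the start address is the lowest address touched and the end address the highest, which is why the transfer always walks ascending memory.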

Any subset of the current bank of registers can be transferred to memory or fetched from memory. The base register Rn determines the source or destination address for a load-store multiple instruction. This register can be optionally updated following the transfer. This occurs when register Rn is followed by the ! character, similar to the single-register load-store using preindex with writeback.

Example 3.17

In this example, register r0 is the base register Rn and is followed by !, indicating that the register is updated after the instruction is executed. You will notice within the load multiple instruction that the registers are not individually listed. Instead the "-" character is used to identify a range of registers. In this case the range is from register r1 to r3 inclusive.

Each register can also be listed, using a comma to separate each register within "{" and "}" brackets.

Figure 3.3 shows a graphical representation.

Figure 3.3. Pre-condition for LDMIA instruction.

The base register r0 points to memory address 0x80010 in the PRE condition. Memory addresses 0x80010, 0x80014, and 0x80018 contain the values 1, 2, and 3 respectively. After the load multiple instruction executes, registers r1, r2, and r3 contain these values as shown in Figure 3.4. The base register r0 now points to memory address 0x8001c, just after the last loaded word.

Figure 3.4. Post-condition for LDMIA instruction.

Now replace the LDMIA instruction with a load multiple and increment before (LDMIB) instruction and use the same PRE conditions. The first word pointed to by register r0 is ignored and register r1 is loaded from the next memory location as shown in Figure 3.5.

Figure 3.5. Post-condition for LDMIB instruction.

After execution, register r0 now points to the last loaded memory location. This is in contrast with the LDMIA example, where r0 pointed to the next memory location.
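The difference between the two cases can be traced with a small Python model of memory and registers. This is only a sketch: the helper `ldm` is hypothetical, and the word stored at 0x8001c (needed by the LDMIB case) is an assumed value, since the text gives only the first three words.

```python
def ldm(mem, base, regs, before):
    """Model LDMIA (before=False) or LDMIB (before=True) with writeback.

    Returns (loaded values by register name, updated base)."""
    addr = base
    loaded = {}
    for reg in regs:                # lowest register pairs with lowest address
        if before:
            addr += 4
        loaded[reg] = mem[addr]
        if not before:
            addr += 4
    return loaded, addr

mem = {0x80010: 1, 0x80014: 2, 0x80018: 3, 0x8001C: 4}  # last value is assumed
names = ["r1", "r2", "r3"]
ia_regs, ia_r0 = ldm(mem, 0x80010, names, before=False)
ib_regs, ib_r0 = ldm(mem, 0x80010, names, before=True)
print("LDMIA:", ia_regs, hex(ia_r0))
print("LDMIB:", ib_regs, hex(ib_r0))
```

Both runs leave r0 at 0x8001c, but after LDMIA that address is the word following the data, whereas after LDMIB it is the last word loaded.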

The decrement versions DA and DB of the load-store multiple instructions decrement the start address and then store to ascending memory locations. This is equivalent to descending memory but accessing the register list in reverse order. With the increment and decrement load multiples, you can access arrays forwards or backwards. They also allow for stack push and pull operations, illustrated later in this section.

Table 3.10 shows a list of load-store multiple instruction pairs. If you use a store with base update, then the paired load instruction of the same number of registers will reload the data and restore the base address pointer. This is useful when you need to temporarily save a group of registers and restore them later.

Table 3.10. Load-store multiple pairs when base update used.

Store multiple | Load multiple
STMIA | LDMDB
STMIB | LDMDA
STMDA | LDMIB
STMDB | LDMIA

Example 3.18

This example shows an STM increment before instruction followed by an LDM decrement after instruction.

The STMIB instruction stores the values 7, 8, 9 to memory. We then corrupt registers r1 to r3. The LDMDA reloads the original values and restores the base pointer r0.
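The save-and-restore round trip can be modelled in Python as follows. The helpers `stmib` and `ldmda`, the base address 0x9000, and the assignment of 7, 8, 9 to r1, r2, r3 are assumptions made for this sketch; the text states only which values are stored.

```python
def stmib(mem, base, values):
    """Store words at base+4, base+8, ... and return the updated base."""
    addr = base
    for v in values:
        addr += 4
        mem[addr] = v
    return addr

def ldmda(mem, base, count):
    """Load `count` words ending at `base`; return (values in register order,
    updated base). The highest register comes from the highest address."""
    values = [mem[base - 4 * i] for i in range(count)][::-1]
    return values, base - 4 * count

mem = {}
r0 = 0x9000                           # assumed base address
r1, r2, r3 = 7, 8, 9                  # assumed register assignment
r0 = stmib(mem, r0, [r1, r2, r3])     # save the registers, r0 moves up
r1, r2, r3 = 1, 2, 3                  # corrupt the registers
(r1, r2, r3), r0 = ldmda(mem, r0, 3)  # reload, r0 moves back down
print(r1, r2, r3, hex(r0))
```

Because the load undoes exactly the address steps taken by the store, both the register contents and the base pointer return to their starting values.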

Example 3.19

We illustrate the use of the load-store multiple instructions with a block memory copy example. This example is a simple routine that copies blocks of 32 bytes from a source address location to a destination address location.

The example has two load-store multiple instructions, which use the same increment after addressing mode.

This routine relies on registers r9, r10, and r11 being set up before the code is executed. Registers r9 and r11 determine the data to be copied, and register r10 points to the destination in memory for the data. LDMIA loads the data pointed to by register r9 into registers r0 to r7. It also updates r9 to point to the next block of data to be copied. STMIA copies the contents of registers r0 to r7 to the destination memory address pointed to by register r10. It also updates r10 to point to the next destination location. CMP and BNE compare pointers r9 and r11 to check whether the end of the block copy has been reached. If the block copy is complete, then the routine finishes; otherwise the loop repeats with the updated values of registers r9 and r10.

The BNE is the branch instruction B with the condition mnemonic NE (not equal). If the previous compare instruction sets the condition flags to not equal, the branch instruction is executed.
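The control flow of the copy loop can be sketched in Python. This models the behaviour described above rather than reproducing the original listing; the function name `block_copy` and the sample addresses are hypothetical.

```python
def block_copy(mem, src, dst, src_end):
    """Copy 32-byte blocks from src to dst until the source pointer hits src_end."""
    r9, r10, r11 = src, dst, src_end
    while True:
        block = [mem[r9 + 4 * i] for i in range(8)]   # load r0-r7, then r9 += 32
        r9 += 32
        for i, word in enumerate(block):              # store r0-r7, then r10 += 32
            mem[r10 + 4 * i] = word
        r10 += 32
        if r9 == r11:                                 # compare, loop while not equal
            break
    return r9, r10

# Two blocks (16 words) of sample data at an assumed source address.
mem = {0x1000 + 4 * i: i for i in range(16)}
end_src, end_dst = block_copy(mem, 0x1000, 0x2000, 0x1040)
print(hex(end_src), hex(end_dst))
```

As in the routine it models, the loop tests only for equality, so the source length must be a non-zero multiple of 32 bytes for it to terminate.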

Figure 3.6 shows the memory map of the block memory copy and how the routine moves through memory. Theoretically this loop can transfer 32 bytes (8 words) in two instructions, for a maximum possible throughput of 46 MB/second being transferred at 33 MHz. These numbers assume a perfect memory system with fast memory.
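The text does not spell out the arithmetic behind the 46 MB/second figure. One set of assumptions that reproduces it is sketched below: the load multiple costs 2 + N cycles (the formula quoted earlier with t = 1), and the store, compare, and taken branch cost 1 + N, 1, and 3 cycles respectively. Those last three cycle counts are assumptions for this sketch, not values given in the text.

```python
CLOCK_HZ = 33_000_000
BYTES_PER_LOOP = 32
N = 8  # registers moved by each load/store multiple

# Assumed cycle cost of one loop iteration (perfect, single-cycle memory).
cycles = {"LDMIA": 2 + N, "STMIA": 1 + N, "CMP": 1, "BNE taken": 3}
total_cycles = sum(cycles.values())

throughput = BYTES_PER_LOOP * CLOCK_HZ / total_cycles  # bytes per second
print(total_cycles, "cycles per block,", round(throughput / 1e6, 1), "MB/s")
```

With these numbers each 32-byte block takes 23 cycles, which rounds to the 46 MB/second quoted above.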

Figure 3.6. Block memory copy in the memory map.

3.3.3.1 Stack Operations

The ARM architecture uses the load-store multiple instructions to carry out stack operations. The pop operation (removing data from a stack) uses a load multiple instruction; similarly, the push operation (placing data onto the stack) uses a store multiple instruction.

When using a stack you have to decide whether the stack will grow up or down in memory. A stack is either ascending (A) or descending (D). Ascending stacks grow towards higher memory addresses; in contrast, descending stacks grow towards lower memory addresses.

When you use a full stack (F), the stack pointer sp points to an address that is the last used or full location (i.e., sp points to the last item on the stack). In contrast, if you use an empty stack (E) the sp points to an address that is the first unused or empty location (i.e., it points after the last item on the stack).

There are a number of load-store multiple addressing mode aliases available to support stack operations (see Table 3.11). Next to the pop column is the actual load multiple instruction equivalent. For example, a full ascending stack would have the notation FA appended to the load multiple instruction: LDMFA. This would be translated into an LDMDA instruction.

Table 3.11. Addressing methods for stack operations.

Addressing mode | Description | Pop | = LDM | Push | = STM
FA | full ascending | LDMFA | LDMDA | STMFA | STMIB
FD | full descending | LDMFD | LDMIA | STMFD | STMDB
EA | empty ascending | LDMEA | LDMDB | STMEA | STMIA
ED | empty descending | LDMED | LDMIB | STMED | STMDA
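Table 3.11 is in effect a lookup from stack-oriented aliases to the underlying addressing modes. A minimal Python sketch of that lookup follows; the names `STACK_ALIASES` and `ALIAS_TO_MODE` are illustrative only.

```python
# stack type: (pop alias, pop instruction, push alias, push instruction)
STACK_ALIASES = {
    "FA": ("LDMFA", "LDMDA", "STMFA", "STMIB"),
    "FD": ("LDMFD", "LDMIA", "STMFD", "STMDB"),
    "EA": ("LDMEA", "LDMDB", "STMEA", "STMIA"),
    "ED": ("LDMED", "LDMIB", "STMED", "STMDA"),
}

# Flatten the table into a single alias -> real instruction mapping.
ALIAS_TO_MODE = {}
for pop_alias, pop, push_alias, push in STACK_ALIASES.values():
    ALIAS_TO_MODE[pop_alias] = pop
    ALIAS_TO_MODE[push_alias] = push

print(ALIAS_TO_MODE["LDMFD"], ALIAS_TO_MODE["STMFD"])
```

Each push alias maps to the store that moves the pointer in the stack's growth direction, and the matching pop alias maps to the load that moves it back.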

ARM has specified an ARM-Thumb Procedure Call Standard (ATPCS) that defines how routines are called and how registers are allocated. In the ATPCS, stacks are defined as being full descending stacks. Thus, the LDMFD and STMFD instructions provide the pop and push functions, respectively.

Example 3.20

The STMFD instruction pushes registers onto the stack, updating the sp. Figure 3.7 shows a push onto a full descending stack. You can see that when the stack grows the stack pointer points to the last full entry in the stack.

Figure 3.7. STMFD instruction: full stack push operation.

Example 3.21

In contrast, Figure 3.8 shows a push operation on an empty stack using the STMED instruction. The STMED instruction pushes the registers onto the stack but updates register sp to point to the next empty location.
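The two push styles can be compared with a short Python model. The helper names, the starting stack pointer 0x80018, and the two pushed values are assumptions chosen for illustration; they are not read from the figures.

```python
def push_fd(mem, sp, values):
    """Full descending push: decrement sp before each store."""
    for v in reversed(values):   # highest register lands at the highest address
        sp -= 4
        mem[sp] = v
    return sp

def push_ed(mem, sp, values):
    """Empty descending push: store at sp, then decrement."""
    for v in reversed(values):
        mem[sp] = v
        sp -= 4
    return sp

fd_mem, ed_mem = {}, {}
fd_sp = push_fd(fd_mem, 0x80018, [2, 3])
ed_sp = push_ed(ed_mem, 0x80018, [2, 3])
print(hex(fd_sp), fd_mem)   # sp addresses the last item pushed
print(hex(ed_sp), ed_mem)   # sp addresses the next free slot
```

Both pushes leave sp at the same address here, but only in the full descending case does that address hold data.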

Figure 3.8. STMED instruction: empty stack push operation.

When handling a checked stack there are three attributes that need to be preserved: the stack base, the stack pointer, and the stack limit. The stack base is the starting address of the stack in memory. The stack pointer initially points to the stack base; as data is pushed onto the stack, the stack pointer descends memory and continuously points to the top of stack. If the stack pointer passes the stack limit, then a stack overflow error has occurred. Here is a small piece of code that checks for stack overflow errors for a descending stack:

ATPCS defines register r10 as the stack limit or sl. This is optional since it is only used when stack checking is enabled. The BLLO instruction is a branch with link instruction plus the condition mnemonic LO. If sp is less than register r10 after the new items are pushed onto the stack, then a stack overflow error has occurred. If the stack pointer goes back past the stack base, then a stack underflow error has occurred.
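The overflow test described above amounts to an unsigned comparison between the new stack pointer and the stack limit. A Python sketch of that logic follows; the function name, the exception used to signal the error, and the sample addresses are all assumptions for illustration.

```python
def push_with_check(sp: int, sl: int, size: int) -> int:
    """Reserve `size` bytes on a descending stack and check against the limit.

    Raises OverflowError when the new sp falls below the stack limit sl,
    mirroring the 'lower' (LO) condition in the text."""
    new_sp = sp - size
    if new_sp < sl:
        raise OverflowError("stack overflow")
    return new_sp

sp = push_with_check(0x8000, 0x7F00, 0x80)   # fits: 0x7F80 is above the limit
print(hex(sp))
try:
    push_with_check(sp, 0x7F00, 0x200)       # would drop below the limit
except OverflowError as err:
    print("detected:", err)
```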

URL: https://www.sciencedirect.com/science/article/pii/B9781558608740500046

Instruction set

Joseph Yiu , in Definitive Guide to Arm® Cortex®-M23 and Cortex-M33 Processors, 2021

5.7.7 Multiple load/store

One of the interesting and very useful features in Arm processors is the multiple load/store instructions. These allow you, using a single instruction, to read or write multiple data that are contiguous in memory. This helps improve code density, and in some cases can also improve performance, for example by reducing the memory bandwidth needed for instruction fetches. The Load Multiple registers (LDM) and Store Multiple registers (STM) instructions only support 32-bit data.

To use STM/LDM instructions, we need to specify the registers being used to hold read/write data using a register list (shown as {reg_list}). It contains at least one register, and:

Starts with "{" and ends with "}".

Uses "-" (hyphen) to indicate range. For example, R0–R4 means R0, R1, R2, R3, and R4.

Uses "," (comma) to separate each register.

Must not contain an SP (Stack Pointer) and, if write back form is used, the base register Rn must not be in the "{reg_list}".

For example, the following instructions read address 0x20000000 to 0x2000000F (four words) into registers R0–R3:

LDR   R4,=0x20000000 ; Set R4 to 0x20000000 (address)

LDMIA R4!, {R0-R3}   ; Read 4 words and store them to R0 - R3

The register list can be noncontiguous. For example, the register list "{R1, R3, R5–R7, R9, R11–12}" contains registers R1, R3, R5, R6, R7, R9, R11, and R12. However, in Armv8-M Baseline, only low registers (R0–R7) can be used in multiple load/store instructions.
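To make the register list notation concrete, here is a Python sketch that expands a list into register numbers. The function `expand_reg_list` is hypothetical and handles only the Rn and Rn-Rm forms shown in the text, not named registers such as SP, LR, or PC.

```python
import re

def expand_reg_list(text: str) -> list:
    """Expand a register list such as "{R1, R3, R5-R7}" into register numbers."""
    regs = []
    for item in text.strip().strip("{}").split(","):
        # Accept "R5", "R5-R7" and the abbreviated "R11-12" (hyphen or en dash).
        m = re.fullmatch(r"\s*R(\d+)\s*(?:[-–]\s*R?(\d+))?\s*", item)
        if m is None:
            raise ValueError(f"bad register list item: {item!r}")
        first = int(m.group(1))
        last = int(m.group(2)) if m.group(2) else first
        regs.extend(range(first, last + 1))
    return regs

regs = expand_reg_list("{R1, R3, R5-R7, R9, R11-12}")
print(regs)
print("low registers only:", all(r <= 7 for r in regs))
```

The final check reflects the Armv8-M Baseline restriction to R0–R7; this particular list would not be accepted there.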

Similar to other load/store instructions, you can use write back (as indicated by "!" in the example below) with STM and LDM. For example,

LDR   R8,=0x8000     ; Set R8 to 0x8000 (address)

STMIA R8!, {R0-R3}   ; R8 changes to 0x8010 after the store

In Armv8-M Baseline, the base register is usually updated automatically after the execution of the LDM/STM instruction (Table 5.28). The exception is the LDM instruction where Rn (the address register) is one of the registers that will be updated by the read operation (i.e., Rn is included in {reg_list}).

Table 5.28. Multiple load/store instructions.

Instruction | Description | Restriction
LDMIA Rn!, {reg_list} | Reads multiple registers from the memory address pointed to by Rn (Rn is not in reg_list). Rn is updated to the subsequent address after the final load operation. | For Armv8-M Baseline, registers in reg_list and Rn must be one of the low registers (R0–R7).
STMIA Rn!, {reg_list} | Writes multiple registers to the memory address pointed to by Rn. Rn is updated to the subsequent address after the final store operation. | Same as above.
LDM Rn, {reg_list} | Reads multiple registers from the memory address pointed to by Rn (Rn is in the reg_list, and is updated by one of the aforementioned data read operations). | Same as above. This form of LDM instruction is only allowed if Rn is in the reg_list for Armv8-M Baseline. Armv8-M Mainline does not have such a restriction.

In Armv8-M Mainline, the LDM and STM instructions support two types of preindexing:

IA: Increment address After each read/write

DB: Decrement address Before each read/write

The LDM and STM instructions can be used without a base address write back. For example, Armv8-M Mainline supports the instructions listed in Table 5.29.

Table 5.29. Additional multiple load/store instructions for Armv8-M Mainline.

Instruction Description Restriction
LDMIA Rn,{reg_list} | Reads multiple words and the address is Incremented After (IA) each register is read. | See list after this table
LDMDB Rn,{reg_list} | Reads multiple words and the address is Decremented Before (DB) each register is read. | See list after this table
STMIA Rn,{reg_list} | Writes multiple words and the address is Incremented After (IA) each register is written. | See list after this table
STMDB Rn,{reg_list} | Writes multiple words and the address is Decremented Before (DB) each register is written. | See list after this table
LDMIA Rn!,{reg_list} | Reads multiple words and the address is Incremented After (IA) each register is read. Rn is then updated to the subsequent address (write-back). | See list after this table
LDMDB Rn!,{reg_list} | Reads multiple words and the address is Decremented Before (DB) each register is read. Rn is then updated to the subsequent address (write-back). | See list after this table
STMIA Rn!,{reg_list} | Writes multiple words and the address is Incremented After (IA) each register is written. Rn is then updated to the subsequent address (write-back). | See list after this table
STMDB Rn!,{reg_list} | Writes multiple words and the address is Decremented Before (DB) each register is written. Rn is then updated to the subsequent address (write-back). | See list after this table

There are some restrictions for LDM and STM instructions:

The base register Rn must not be PC

In any STM instruction, reg_list must not contain PC.

In any LDM instruction, reg_list must not contain PC and LR at the same time.

Write-back form (with Rn!) must not be used if Rn is inside reg_list.

The transfer address must be word aligned.
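The restrictions listed above lend themselves to a simple checker. The Python sketch below encodes only those five rules; the function name `check_ldm_stm` and its interface are assumptions, and a real assembler applies further architecture-specific checks.

```python
PC, LR = 15, 14  # register numbers of the program counter and link register

def check_ldm_stm(op, rn, reg_list, writeback, address):
    """Return the violated restrictions (empty list if none) for an LDM or STM."""
    problems = []
    if rn == PC:
        problems.append("base register is PC")
    if op == "STM" and PC in reg_list:
        problems.append("STM reg_list contains PC")
    if op == "LDM" and PC in reg_list and LR in reg_list:
        problems.append("LDM reg_list contains both PC and LR")
    if writeback and rn in reg_list:
        problems.append("write-back with Rn in reg_list")
    if address % 4 != 0:
        problems.append("address not word aligned")
    return problems

print(check_ldm_stm("LDM", 4, {0, 1, 2, 3}, True, 0x20000000))
print(check_ldm_stm("STM", 0, {0, PC}, True, 0x20000002))
```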

In general, LDM and STM instructions should be avoided when accessing peripheral registers where the access could have a side effect, e.g., a FIFO register, where a read/write can change the state of the FIFO. This is because Armv8-M Baseline and Armv6-M processors are allowed to abandon and restart an instruction if an interrupt takes place after the instruction has started. When the LDM/STM instruction restarts after the interrupt service routine, the LDM/STM access to some of the registers could be erroneously repeated.

In the Armv8-M Mainline and Armv7-M architectures, the interrupt continuation bit field in the program status register allows the state of the LDM and STM to be stored and to resume without having to repeat transfers that have already been carried out. These processors are not, therefore, subject to the issue highlighted in the previous paragraph.
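The hazard can be illustrated with a toy Python model of a FIFO data register, where every read removes one item. This is a deliberately simplified sketch: `ldm_from_fifo` and its parameters are hypothetical, and real interrupt timing is far more involved.

```python
from collections import deque

def ldm_from_fifo(fifo, count, interrupt_after=None, can_resume=False):
    """Model an LDM of `count` words from a FIFO data register.

    If an interrupt arrives after `interrupt_after` reads, the instruction
    either resumes (can_resume=True) or is abandoned and restarted."""
    regs = []
    if interrupt_after is not None:
        regs = [fifo.popleft() for _ in range(interrupt_after)]
        if not can_resume:
            regs = []                  # partial result discarded, FIFO already drained
    while len(regs) < count:           # (re)start or continue the transfer
        regs.append(fifo.popleft())
    return regs

restarted = ldm_from_fifo(deque(range(1, 9)), 4, interrupt_after=2)
resumed = ldm_from_fifo(deque(range(1, 9)), 4, interrupt_after=2, can_resume=True)
print("restart:", restarted)   # the first two FIFO words are lost
print("resume: ", resumed)
```

In the restart case the repeated reads consume extra FIFO entries, so the registers end up holding the wrong words.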

URL: https://www.sciencedirect.com/science/article/pii/B9780128207352000056

Instruction Set

Joseph Yiu , in The Definitive Guide to ARM® CORTEX®-M3 and CORTEX®-M4 Processors (3rd Edition), 2014

Multiple load and multiple store

One of the key advantages of the ARM architecture is that it allows you to read or write multiple data that are contiguous in memory. The LDM (Load Multiple registers) and STM (Store Multiple registers) instructions only support 32-bit data. They support two types of pre-indexing:

IA: Increment address After each read/write

DB: Decrement address Before each read/write

The LDM and STM instructions can be used without base address write back (Table 5.15).

Table 5.15. Multiple Load/Store Memory Access Instructions

Examples of Multiple Load/Store | Description
LDMIA Rn,<reg list> | Read multiple words from memory location specified by Rn. Address Increment After (IA) each read.
LDMDB Rn,<reg list> | Read multiple words from memory location specified by Rn. Address Decrement Before (DB) each read.
STMIA Rn,<reg list> | Write multiple words to memory location specified by Rn. Address increment after each write.
STMDB Rn,<reg list> | Write multiple words to memory location specified by Rn. Address Decrement Before each write.

The <reg list> in Table 5.15 is the register list. It contains at least one register, and:

Starts with "{" and ends with "}"

Uses "-" (hyphen) to indicate range. For example, R0-R4 means R0, R1, R2, R3 and R4.

Uses "," (comma) to separate each register

For example, the following instructions read address 0x20000000 to 0x2000000F (4 words) into R0 to R3:

  LDR   R4,=0x20000000 ; Set R4 to 0x20000000 (address)

  LDMIA R4, {R0-R3}   ; Read 4 words and store them to R0 - R3

The register list can be non-contiguous, such as {R1, R3, R5-R7, R9, R11-12}, which contains R1, R3, R5, R6, R7, R9, R11, R12.

Similar to other load/store instructions, you can use write back with STM and LDM. For example:

  LDR   R8,=0x8000     ; Set R8 to 0x8000 (address)

  STMIA R8!, {R0-R3}   ; R8 changes to 0x8010 after the store

Table 5.16. Multiple Load/Store Memory Access Instructions with Write Back

Example of Multiple Load/Store with Write Back | Description
LDMIA Rn!,<reg list> | Read multiple words from memory location specified by Rn. Address Increment After (IA) each read. Rn writes back after the transfer is done.
LDMDB Rn!,<reg list> | Read multiple words from memory location specified by Rn. Address Decrement Before (DB) each read. Rn writes back after the transfer is done.
STMIA Rn!,<reg list> | Write multiple words to memory location specified by Rn. Address increment after each write. Rn writes back after the transfer is done.
STMDB Rn!,<reg list> | Write multiple words to memory location specified by Rn. Address Decrement Before each write. Rn writes back after the transfer is done.

Multiple load/store memory access instructions with write back are listed in Table 5.16. The 16-bit versions of the LDM and STM instructions are limited to low registers only and always have write back enabled, except when the base register is one of the destination registers to be updated by the memory read.

If the floating point unit is present, the instructions in Table 5.17 are also available to perform load multiple and store multiple operations to the registers in the floating point unit.

Table 5.17. Multiple Load/Store Memory Access Instructions for Floating Point Unit with Write Back

Example of Multiple Load/Store with Write Back | Description
VLDMIA.32 Rn, <s_reg list> | Read multiple single-precision data. Address Increment After (IA) each read.
VLDMDB.32 Rn, <s_reg list> | Read multiple single-precision data. Address Decrement Before (DB) each read.
VLDMIA.64 Rn, <d_reg list> | Read multiple double-precision data. Address Increment After (IA) each read.
VLDMDB.64 Rn, <d_reg list> | Read multiple double-precision data. Address Decrement Before (DB) each read.
VSTMIA.32 Rn, <s_reg list> | Write multiple single-precision data. Address increment after each write.
VSTMDB.32 Rn, <s_reg list> | Write multiple single-precision data. Address decrement before each write.
VSTMIA.64 Rn, <d_reg list> | Write multiple double-precision data. Address increment after each write.
VSTMDB.64 Rn, <d_reg list> | Write multiple double-precision data. Address decrement before each write.
VLDMIA.32 Rn!, <s_reg list> | Read multiple single-precision data. Address Increment After (IA) each read. Rn writes back after the transfer is done.
VLDMDB.32 Rn!, <s_reg list> | Read multiple single-precision data. Address Decrement Before (DB) each read. Rn writes back after the transfer is done.
VLDMIA.64 Rn!, <d_reg list> | Read multiple double-precision data. Address Increment After (IA) each read. Rn writes back after the transfer is done.
VLDMDB.64 Rn!, <d_reg list> | Read multiple double-precision data. Address Decrement Before (DB) each read. Rn writes back after the transfer is done.
VSTMIA.32 Rn!, <s_reg list> | Write multiple single-precision data. Address increment after each write. Rn writes back after the transfer is done.
VSTMDB.32 Rn!, <s_reg list> | Write multiple single-precision data. Address decrement before each write. Rn writes back after the transfer is done.
VSTMIA.64 Rn!, <d_reg list> | Write multiple double-precision data. Address increment after each write. Rn writes back after the transfer is done.
VSTMDB.64 Rn!, <d_reg list> | Write multiple double-precision data. Address decrement before each write. Rn writes back after the transfer is done.

URL: https://www.sciencedirect.com/science/article/pii/B9780124080829000051

Load/Store and Branch Instructions

Larry D. Pyeatt, in Modern Assembly Language Programming with the ARM Processor, 2016

Operations

Name | Effect | Description

ldmia and ldmfd
    addr ← Rd
    for all i ∈ register_list do
        i ← Mem[addr]
        addr ← addr + 4
    end for
    if ! is present then
        Rd ← addr
    end if
Load multiple registers from memory, starting at the address in Rd, and increment the address by four bytes after each load.

stmia and stmea
    addr ← Rd
    for all i ∈ register_list do
        Mem[addr] ← i
        addr ← addr + 4
    end for
    if ! is present then
        Rd ← addr
    end if
Store multiple registers in memory, starting at the address in Rd, and increment the address by four bytes after each store.

ldmib and ldmed
    addr ← Rd
    for all i ∈ register_list do
        addr ← addr + 4
        i ← Mem[addr]
    end for
    if ! is present then
        Rd ← addr
    end if
Load multiple registers from memory, starting at the address in Rd, and increment the address by four bytes before each load.

stmib and stmfa
    addr ← Rd
    for all i ∈ register_list do
        addr ← addr + 4
        Mem[addr] ← i
    end for
    if ! is present then
        Rd ← addr
    end if
Store multiple registers in memory, starting at the address in Rd, and increment the address by four bytes before each store.

ldmda and ldmfa
    addr ← Rd
    for all i ∈ register_list do
        i ← Mem[addr]
        addr ← addr − 4
    end for
    if ! is present then
        Rd ← addr
    end if
Load multiple registers from memory, starting at the address in Rd, and decrement the address by four bytes after each load.

stmda and stmed
    addr ← Rd
    for all i ∈ register_list do
        Mem[addr] ← i
        addr ← addr − 4
    end for
    if ! is present then
        Rd ← addr
    end if
Store multiple registers in memory, starting at the address in Rd, and decrement the address by four bytes after each store.

ldmdb and ldmea
    addr ← Rd
    for all i ∈ register_list do
        addr ← addr − 4
        i ← Mem[addr]
    end for
    if ! is present then
        Rd ← addr
    end if
Load multiple registers from memory, starting at the address in Rd, and decrement the address by four bytes before each load.

stmdb and stmfd
    addr ← Rd
    for all i ∈ register_list do
        addr ← addr − 4
        Mem[addr] ← i
    end for
    if ! is present then
        Rd ← addr
    end if
Store multiple registers in memory, starting at the address in Rd, and decrement the address by four bytes before each store.
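The four load variants differ only in the sign of the step and in whether the address changes before or after each access, so they can share one Python sketch. The names `MODES` and `ldm` are hypothetical. The pseudocode leaves the iteration order over the register list open; on ARM the lowest-numbered register is always paired with the lowest address, so this sketch visits registers in descending order for the decrement modes.

```python
MODES = {  # suffix: (step in bytes, adjust the address before the access?)
    "ia": (+4, False), "ib": (+4, True),
    "da": (-4, False), "db": (-4, True),
}

def ldm(mode, mem, regs, rd, reg_list, writeback=False):
    """Run the load-multiple pseudocode; `regs` maps register number -> value."""
    step, before = MODES[mode]
    addr = regs[rd]
    for i in sorted(reg_list, reverse=(step < 0)):
        if before:
            addr += step
        regs[i] = mem[addr]
        if not before:
            addr += step
    if writeback:
        regs[rd] = addr
    return regs

mem = {0x100: 10, 0x104: 20, 0x108: 30}          # assumed sample data
up = ldm("ia", mem, {0: 0x100}, 0, [1, 2, 3], writeback=True)
down = ldm("db", mem, {0: 0x10C}, 0, [1, 2, 3], writeback=True)
print(up)
print(down)
```

Both calls read the same three words into r1 to r3; they differ only in where the base register r0 ends up.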

URL: https://www.sciencedirect.com/science/article/pii/B9780128036983000036

The ARM Vector Floating Point Coprocessor

Larry D. Pyeatt, in Modern Assembly Language Programming with the ARM Processor, 2016

Operations

Name | Effect | Description

vldmia
    addr ← Rd
    for i ∈ register_list do
        i ← Mem[addr]
        if single then
            addr ← addr + 4
        else
            addr ← addr + 8
        end if
    end for
    if ! is present then
        Rd ← addr
    end if
Load multiple registers from memory starting at the address in Rd. Increment address after each load.

vstmia
    addr ← Rd
    for i ∈ register_list do
        Mem[addr] ← i
        if single then
            addr ← addr + 4
        else
            addr ← addr + 8
        end if
    end for
    if ! is present then
        Rd ← addr
    end if
Store multiple registers in memory starting at the address in Rd. Increment address after each store.

vldmdb
    addr ← Rd
    for i ∈ register_list do
        if single then
            addr ← addr − 4
        else
            addr ← addr − 8
        end if
        i ← Mem[addr]
    end for
    Rd ← addr
Load multiple registers from memory starting at the address in Rd. Decrement address before each load.

vstmdb
    addr ← Rd
    for i ∈ register_list do
        if single then
            addr ← addr − 4
        else
            addr ← addr − 8
        end if
        Mem[addr] ← i
    end for
    Rd ← addr
Store multiple registers in memory starting at the address in Rd. Decrement address before each store.
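The only new element compared with the integer instructions is the stride, which depends on the precision. The Python sketch below pairs a decrement-before store with an increment-after load, as one might when saving and restoring floating-point registers. The helper names and sample values are assumptions, and each value occupies a single dictionary entry regardless of its width.

```python
def vstmdb(mem, addr, values, double=False):
    """Decrement by 4 (single) or 8 (double) before each store; return new addr."""
    size = 8 if double else 4
    for v in reversed(values):     # lowest register ends up at the lowest address
        addr -= size
        mem[addr] = v
    return addr

def vldmia(mem, addr, count, double=False):
    """Load `count` values, incrementing by 4 or 8 after each; return (values, addr)."""
    size = 8 if double else 4
    values = []
    for _ in range(count):
        values.append(mem[addr])
        addr += size
    return values, addr

mem = {}
sp = vstmdb(mem, 0x1000, [1.5, 2.5], double=True)   # save two doubles
restored, sp_after = vldmia(mem, sp, 2, double=True)
print(hex(sp), restored, hex(sp_after))
```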

URL: https://www.sciencedirect.com/science/article/pii/B9780128036983000097