Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Vector processor
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
=== Fault (or Fail) First === Introduced in ARM SVE2 and RISC-V RVV is the concept of speculative sequential Vector Loads. ARM SVE2 has a special register named "First Fault Register",<ref>[https://developer.arm.com/tools-and-software/server-and-hpc/compile/arm-instruction-emulator/resources/tutorials/sve/sve-vs-sve2/single-page Introduction to ARM SVE2]</ref> where RVV modifies (truncates) the Vector Length (VL).<ref>[https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#unit-stride-fault-only-first-loads RVV fault-first loads]</ref> The basic principle of {{Not a typo|ffirst}} is to attempt a large sequential Vector Load, but to allow the hardware to arbitrarily truncate the ''actual'' amount loaded to either the amount that would succeed without raising a memory fault or simply to an amount (greater than zero) that is most convenient. The important factor is that ''subsequent'' instructions are notified or may determine exactly how many Loads actually succeeded, using that quantity to only carry out work on the data that has actually been loaded. Contrast this situation with SIMD, which is a fixed (inflexible) load width and fixed data processing width, unable to cope with loads that cross page boundaries, and even if they were they are unable to adapt to what actually succeeded, yet, paradoxically, if the SIMD program were to even attempt to find out in advance (in each inner loop, every time) what might optimally succeed, those instructions only serve to hinder performance because they would, by necessity, be part of the critical inner loop. This begins to hint at the reason why {{Not a typo|ffirst}} is so innovative, and is best illustrated by memcpy or strcpy when implemented with standard 128-bit non-predicated {{Not a typo|non-ffirst}} SIMD. For IBM POWER9 the number of hand-optimised instructions to implement strncpy is in excess of 240.<ref>[https://patchwork.ozlabs.org/project/glibc/patch/20200904165653.16202-1-rzinsly@linux.ibm.com/ PATCH to libc6 to add optimised POWER9 strncpy]</ref> By contrast, the same strncpy routine in hand-optimised RVV assembler is a mere 22 instructions.<ref>[https://github.com/riscv/riscv-v-spec/blob/master/example/strncpy.s RVV strncpy example]</ref> The above SIMD example could potentially fault and fail at the end of memory, due to attempts to read too many values: it could also cause significant numbers of page or misaligned faults by similarly crossing over boundaries. In contrast, by allowing the vector architecture the freedom to decide how many elements to load, the first part of a strncpy, if beginning initially on a sub-optimal memory boundary, may return just enough loads such that on ''subsequent'' iterations of the loop the batches of vectorised memory reads are optimally aligned with the underlying caches and virtual memory arrangements. Additionally, the hardware may choose to use the opportunity to end any given loop iteration's memory reads ''exactly'' on a page boundary (avoiding a costly second TLB lookup), with speculative execution preparing the next virtual memory page whilst data is still being processed in the current loop. All of this is determined by the hardware, not the program itself.<ref>[https://alastairreid.github.io/papers/sve-ieee-micro-2017.pdf ARM SVE2 paper by N. Stevens]</ref>
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)