fpga

Why FPGAs Deserve Your Attention in Machine Vision and Edge AI Applications

As businesses increasingly rely on machine vision to enhance quality, improve productivity, and increase the bottom line, technology providers are relying more on industrial computing solutions that enable faster processing speeds and higher efficiency, or that support new tasks altogether. 




fpga

Make : FPGAs : turning software into hardware with eight fun and easy DIY projects

Location: Engineering Library- TK7895.G36R66 2016




fpga

ASIC and/or FPGA Design & Verification Engineer On Site

El Segundo, CA United States - Job Description At Boeing, we innovate and collaborate to make the world a better place. From the seabed to outer space, you can contribute to work that matters with a company where diversity, equity and inclusion are shared values. We’re committed to fostering an... View





fpga

Locking When Emulating Xtensa LX Multi-Core on a Xilinx FPGA

Today's high-performance computing systems often require the designer to instantiate multiple CPU or DSP cores in their subsystem. However, the performance gained by using multiple CPUs comes with additional programming complexity, especially when accessing shared memory data structures and hardware peripherals. CPU cores need to access shared data in an atomic fashion in a multi-core environment. Locking is the most basic requirement for data sharing. A core takes the lock, accesses the shared data structure, and releases the lock. While one core has the lock, other cores are disallowed from accessing the same data structure. Typically, locking is implemented using an atomic read-modify-write bus transaction on a variable allocated in an uncached memory.

This blog shares the AXI4 locking mechanism when implementing an Xtensa LX-based multi-core system on a Xilinx FPGA platform. It uses a dual-core design mapped to a KC705 platform as an example.

Exclusive Access to Accomplish Locking

The Xtensa AXI4 manager provides atomic access using the AXI4 atomic access mechanism. While Xtensa's AXI manager interface generates an exclusive transaction, the subordinate's interface is also expected to support exclusive access, i.e., AXI monitoring. Xilinx BRAM controller's AXI subordinate interface does not support exclusive access, i.e., AXI monitoring: AXI Feature Adoption in Xilinx FPGAs.

Leveraging Xtensa AXI4 Subordinate Exclusive Access

The Xtensa LX AXI subordinate interface supports exclusive access. One approach is to utilize this support and allocate locks in one of the core's local data memories. Ensure that the number of external exclusive managers is configured, typically to the number of cores (Figure 1).

Figure 1

Note that the Xtensa NX AXI subordinate interface does not support exclusive access. For an Xtensa NX design, shared memory with AXI monitoring is required.

In Figure 2, the AXI_crossbar#2 (block in green) routes core#0's manager AXI access (blue connection) to both core's local memories. Core#1's manager AXI (yellow connection) can also access both core's local memories. Locks can be allocated in either core's local data memory.

In-Bound Access on Subordinate Interface

On inbound access, the Xtensa AXI subordinate interface expects a local memory address, i.e., an external entity needs to present the same address as the core would use to access local memory in its 4GB address space. AXI address remap IP (block in pink) translates the AXI system address to each core's local address. For example, assuming locks are allocated in core#0's local memory, core#1 generates an AXI exclusive to access a lock allocated in core#0's local memory (yellow connection). AXI_crossbar#2 forwards transaction to M03_AXI port (green connection). AXI_address_remap#1 translates the AXI system address to the local memory address before presenting it to core#0's AXI subordinate interface (pink connection).

It is possible to configure cores with disjoint local data memory addresses and avoid the need for an address remap IP block. But then it will be a heterogeneous multi-core design with a multi-image build. An address remap IP is required to keep things simple, i.e., a homogeneous multi-core with a single image build. A single image uses a single memory map. Therefore, both cores must have the same view of a lock, i.e., the lock's AXI bus address must be the same for both.

Figure 2

AXI ID Width

Note Xtensa AXI manager interface ID width=4 bits. Xtensa's AXI subordinate interface ID width=12 bits. So, you must configure AXI crossbar#2 and AXI address remap AXI ID width higher than 4. AXI IDs on a manager port are not globally defined; thus, an AXI crossbar with multiple manager ports will internally prefix the manager port index to the ID and provide this concatenated ID to the subordinate device. On return of the transaction to its manager port of origin, this ID prefix will be used to locate the manager port, and the prefix will be truncated. Therefore, the subordinate port ID is wider in bits than the manager port ID. Figure 3 shows the Xilinx crossbar IP AXI ID width configuration.

Figure 3

Software Tools Support

Cadence tools provide a way to place locks at a specific location. For more details, please refer to Cadence's Linker Support Packages (LSP) Reference Manual for Xtensa SDK. .xtos.lock(green) resides in core#0's local memory and holds user-defined and C library locks. The lock segment memory attribute is defined as shared inner (cyan) so that L32EX and S32EX instructions generate an exclusive transaction on an AXI bus. See Figure 4. The stack and per-core Xtos and C library contexts are allocated in local data memory (yellow).

…………..LSP memory map………….
BEGIN dram0
0x40000000: dataRam : dram0 : 0x8000 : writable ;
dram0_0 : C : 0x40000400 - 0x40007fff : STACK : .dram0.rodata .clib.percpu.data .rtos.percpu.data .dram0.data .clib.percpu.bss .rtos.percpu.bss .dram0.bss;
END dram0
…………………
BEGIN sysViewDataRam0
0xA0100000: system : sysViewDataRam0 : 0x8000 : writable, uncached, shared_inner;
lockRam_0 : C : 0xA0100000 - 0xA01003ff : .xtos.lock;
END sysViewDataRam0
…………..

Figure 4

Please visit the Cadence support site for more information on emulating Xtensa cores on FPGAs.




fpga

Maxeler Technologies Develops Real Time FPGA-based Processing for European XFEL

Developing programmable hardware for European XFEL for real-time line-rate processing on the raw data streams produced by the detector using Maxeler's powerful FPGA solutions.




fpga

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration. (arXiv:2005.03451v1 [cs.LG])

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.




fpga

Integrating multiple FPGA designs by merging configuration settings

This disclosure relates generally to field-programmable gate arrays (FPGAs). Some implementations relate to methods and systems for transmitting and integrating an intellectual property (IP) block with another user's design. The IP developer can design the IP block to include both a secret portion and a public portion. The IP block developer can send or otherwise provide the IP block to another IP user without disclosing the functional description of the secret portion of the IP block. In some implementations, the IP developer provides the public portion to the IP user at the register-transfer-level (RTL) level, as a hardware description language (HDL)-implemented design, or as a synthesizable netlist. In some implementations, the IP developer provides the secret portion of the IP block to the user in the form of programming bits without providing an HDL, RTL, or netlist implementation of the secret portion.




fpga

Verification module apparatus for debugging software and timing of an embedded processor design that exceeds the capacity of a single FPGA

A plurality of Field Programmable Gate Arrays (FPGA), high performance transceivers, and memory devices provide a verification module for timing and state debugging of electronic circuit designs. Signal value compression circuits and gigabit transceivers embedded in each FPGA increase the fanout of each FPGA. Ethernet communication ports enable remote software debugging of processor instructions.




fpga

Unpatchable 'Starbleed' Bug in FPGA Chips Exposes Critical Devices to Hackers

A newly discovered unpatchable hardware vulnerability in Xilinx programmable logic products could allow an attacker to break bitstream encryption, and clone intellectual property, change the functionality, and even implant hardware Trojans. The details of the attacks against Xilinx 7-Series and Virtex-6 Field Programmable Gate Arrays (FPGAs) have been covered in a paper titled "The




fpga

Microchip’s Low-Power Radiation-Tolerant (RT) PolarFire FPGA Enables High-Bandwidth Space Systems with Lower Total System Cost

Microchip’s Low-Power Radiation-Tolerant (RT) PolarFire FPGA Enables High-Bandwidth Space Systems with Lower Total System Cost




fpga

Microchip Unveils Family Details and Opens Early Access Program for RISC-V Enabled Low-Power PolarFire SoC FPGA Family

Microchip Unveils Family Details and Opens Early Access Program for RISC-V Enabled Low-Power PolarFire SoC FPGA Family




fpga

2019 International Conference on ReConFigurable Computing and FPGAs (ReConFig) [electronic journal].

IEEE / Institute of Electrical and Electronics Engineers Incorporated




fpga

2006 IEEE International Conference on Reconfigurable Computing and FPGA's [electronic journal].

IEEE Computer Society




fpga

2005 International Conference on Reconfigurable Computing and FPGAs ReConFig 2005 [electronic journal].

IEEE Computer Society




fpga

FPGA-based implementation of concatenative speech synthesis algorithm




fpga

Implementation of unmanned vehicle control on FPGA based platform using system generator




fpga

Development of an FPGA based autopilot hardware platform for research and development of autonomous systems




fpga

Study of FPGA implementation of entropy norm computation for IP data streams




fpga

Parallel genetic algorithm engine on an fpga




fpga

Logic synthesis for FPGA-based control units: structural decomposition in logic design / Alexander Barkalov, Larysa Titarenko, Kamil Mielcarek, Sławomir Chmielewski

Online Resource