Building for the Parallella with FuseSOC

I am currently working in the background on porting the FuseSOC build tool to the Parallella.

I will report updates in the thread here, but I wanted somewhere to keep an up-to-date set of instructions, which I will revise each time I refine the process.

Current status:

Currently, you can use FuseSOC to build the parallella_7020_headless example project from the parallella-hw repository, with a few hacks, a new repository holding a config for the Parallella, and two extra scripts (I am knocking these off one at a time).

I have now tested the output on target (read on below), and verified I can still talk to the Epiphany, but for now you use the output of this tool at your own risk. Before too long, I hope to be building all my projects using FuseSOC, and will be able to remove the above warning 🙂

Desired result:

The goal is to be able to fetch straight from the parallella-hw repo (update: achievement unlocked) and build for other configurations (headless/HDMI, z7020/z7010) with minimal changes, and then to provide an easy way to build (all) your own projects on top of this. I envisage doing this by patching the base project’s files with your changes, and pulling in any new files either by adding them to this repo or by provisioning them from your own repo (say, if you want to pull in a custom core). Managing the patches is going to be an interesting challenge (ideally you don’t want to commit for every build), but I’ll flesh those ideas out as the port improves.

Installing FuseSOC:

Fetch the repositories from GitHub:

git clone https://github.com/yanidubin/fusesoc
git clone https://github.com/yanidubin/parallella-cores

Go to the fusesoc folder and build, install and configure:

cd fusesoc
autoreconf -i && ./configure && make && sudo make install
fusesoc init

When you are prompted for the location of orpsoc-cores, install it to any convenient location; this fetches the OpenRISC cores (which I am not using yet).

Starting a build:

NOTE: at present you will see seven warnings about system files not being found (in bold yellow at the very start of the build). You can ignore these; they stem from a temporary issue with one of the workarounds I have implemented, and the build will still succeed. The various other ERROR messages seen during the build are not actually fatal either; they are just rough edges in how I am driving the Xilinx CLI tools.

Create a test folder (anywhere), import Xilinx environment settings, and start the build.

mkdir test
cd test
. /opt/Xilinx/14.7/ISE_DS/settings64.sh
/usr/bin/time -v fusesoc --cores-root=../parallella-cores build parallella

The build shouldn’t take much longer than PlanAhead reports (if you add synth+impl). It takes a little longer because, due to some current limitations, it has to run synthesis twice.

Process "Generate Programming File" completed successfully
INFO:TclTasksC:1850 - process run : Generate Programming File is done.

	Command being timed: "fusesoc build parallella"
	User time (seconds): 428.61
	System time (seconds): 4.04
	Percent of CPU this job got: 94%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 7:37.23

If you do not wish to specify the path to parallella-cores on the command line as above, you can instead edit ~/.config/fusesoc/fusesoc.conf manually and add the path to parallella-cores after orpsoc-cores, for example:


$ cat ~/.config/fusesoc/fusesoc.conf
[main]
cores_root = /stash/files/dev/embedded/fpga/fusesoc-cores /stash/files/dev/embedded/fpga/parallella-cores
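
The same edit can be scripted. The following is only a minimal sketch and not part of FuseSOC: add_cores_root is a hypothetical helper that appends a directory to the cores_root line. It assumes the file already contains such a line (as in the example above), that GNU sed is available, and that the path contains no '|' or '&' characters.

```shell
#!/bin/sh
# Hypothetical helper (not part of FuseSOC): append a directory to the
# cores_root line of a fusesoc.conf-style file.
# Usage: add_cores_root <conf-file> <directory>
add_cores_root() {
    conf=$1
    dir=$2
    # Skip the edit if the directory is already listed (plain substring match).
    if grep '^cores_root' "$conf" | grep -qF -- "$dir"; then
        return 0
    fi
    # Append to the existing cores_root line. '|' is the sed delimiter so
    # the slashes in the path need no escaping.
    # Assumption: 'sed -i' without a suffix, i.e. GNU sed.
    sed -i "s|^\(cores_root *=.*\)\$|\1 $dir|" "$conf"
}
```

It would be called as add_cores_root ~/.config/fusesoc/fusesoc.conf /path/to/parallella-cores, and calling it a second time with the same path leaves the file unchanged.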

Testing the build:

The resulting .bit file is at build/parallella/bld-ise/parallella_z7_top.bit. I will be adding automatic conversion to a .bit.bin very soon, but for now I just use promgen:

$ promgen -b -w -p bin -data_width 32 -u 0 build/parallella/bld-ise/parallella_z7_top.bit -o FuseSOC-test-001.bit.bin
$ scp FuseSOC-test-001.bit.bin linaro@Parallella:

So far I have loaded this onto my board and verified that it programs and that I can still talk to the Epiphany chip:

linaro-nano:~> sudo su
root@linaro-nano:/home/linaro# cat FuseSOC-test-001.bit.bin > /dev/xdevcfg
root@linaro-nano:/home/linaro# cat /sys/devices/amba.1/f8007000.devcfg/prog_done
1
root@linaro-nano:/home/linaro# exit
linaro-nano:~> cd ~/epiphany-examples/apps/matmul-16
linaro-nano:~/epiphany-examples/apps/matmul-16> ./run.sh

and the test passes – so I at least got something right.
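
The programming step in the session above can likewise be scripted. A minimal sketch, assuming the device node and sysfs path shown there (both may differ on other kernels): program_fpga is a hypothetical helper, needs to run as root on the board, and accepts overrides for both paths.

```shell
#!/bin/sh
# Hypothetical helper: write a .bit.bin to the devcfg node and check prog_done.
# Usage: program_fpga <image.bit.bin> [device] [prog_done-file]
program_fpga() {
    bin=$1
    # Defaults taken from the session above; override them if your kernel differs.
    dev=${2:-/dev/xdevcfg}
    done_file=${3:-/sys/devices/amba.1/f8007000.devcfg/prog_done}
    # Push the image into the configuration device.
    cat "$bin" > "$dev" || return 1
    # prog_done reads 1 once the FPGA has accepted the bitstream.
    if [ "$(cat "$done_file")" = "1" ]; then
        return 0
    fi
    echo "program_fpga: prog_done is not 1" >&2
    return 1
}
```

A call such as program_fpga FuseSOC-test-001.bit.bin returns non-zero if the write fails or prog_done does not read 1, so it can gate the matmul-16 test run.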

I am not aware of any issues at this time; however, my testing is far from extensive.

Build tool validation – work required

The images are completely different: I ran a hexdump and diff of the FuseSOC output against parallella_e16_headless_gpiose_7020.bit.bin, which I expect was built from the same source inputs using PlanAhead.

Ideally, since I am theoretically using the same tools, I would have produced something very similar, which would prove that the build system is a perfect drop-in replacement for PlanAhead. We might see a different timestamp or some such, but the images would otherwise be binary identical.

I will do some further tests by comparing some of the intermediate files at some point, so others can be confident it is safe to use, but this is not all that important for me at this stage, so long as it runs.
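
For a rough measure of how far two images diverge, the hexdump/diff comparison can be reduced to a byte count. This is a sketch using the standard cmp utility: count_diff_bytes is a hypothetical helper name, and it only compares bytes up to the length of the shorter file.

```shell
#!/bin/sh
# Hypothetical helper: print the number of byte positions at which two
# files differ (compared up to the length of the shorter file).
# Usage: count_diff_bytes <file-a> <file-b>
count_diff_bytes() {
    # 'cmp -l' prints one line per differing byte. It exits non-zero when
    # the files differ, which is expected here, hence the '|| true'.
    # 'tr' strips the padding some wc implementations add.
    n=$( { cmp -l "$1" "$2" 2>/dev/null || true; } | wc -l | tr -d ' ' )
    echo "$n"
}
```

For example: count_diff_bytes FuseSOC-test-001.bit.bin parallella_e16_headless_gpiose_7020.bit.bin. A count close to the file size matches the "completely different" observation, while a handful of differing bytes would point to something like a timestamp.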

Why do the bitstreams differ?

Olof (author of FuseSOC) has confirmed in his comment below that this is to be expected, and not at all surprising.

So the following observations would seem to be correct – read on only if interested.

My work has recently involved binary patching of mobile radios in the field (including the FPGA hosting our soft processor, the soft processor firmware, and a DSP), and so I have looked into how much impact a tiny change in the VHDL has on a binary bitstream. We must limit the amount of patch data to send over a very low speed radio link.

I found that bzdiff (bsdiff4 modified to use zlib rather than bzip2 for stream compression) produced a larger patch file than naive zlib compression alone. That was not at all the case with our firmware and DSP images, which benefited from the bsdiff algorithm.

Granted, this was with a much older FPGA from a different vendor (a Cyclone, from Altera), but given that observation, the difference above neither surprises nor worries me. I had conjectured that if a minor maintenance change to our radio FPGAs can entirely change the bitstream, then something like different optimisation settings might produce a vastly different result, since mapping and routing may make quite different choices about where to place things even when only a few initial changes are introduced.