Skip to content

Add few ARM DSP Intrinsics #529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 20, 2018
Merged

Add few ARM DSP Intrinsics #529

merged 2 commits into from
Jul 20, 2018

Conversation

paoloteti
Copy link
Contributor

Add few ARM DSP intrinsics:

  • Signed saturating add/sub
  • Saturating four 8-bit integer add/sub
  • Saturating two 8-bit integer add/sub

The intent is mainly to setup the module and to add all the rest in the future.
Listed intrinsics are available on Cortex-M too (+dsp is required on some model except on M4).

For now are limited to v7 only for easy test on CI and also because I can't test on
real hardware (I use/have Cortex-R boards only around)

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 18, 2018

This needs rebasing on top of master since the portable vector types have been removed. There is an internal coresimd/simd.rs module that you can use.

@gnzlbg gnzlbg self-requested a review July 18, 2018 23:52
- Signed saturating add/sub
- Saturating four 8-bit integer add/sub
- Saturating two 8-bit integer add/sub

The intent is mainly to setup the module and to add all
the rest in the future.

Listed intrinsics are available on Cortex-M too (+dsp is required
on some model except for M4).
Rebase everything on top of master since the portable vector types
have been removed.
@alexcrichton alexcrichton merged commit b9de11a into rust-lang:master Jul 20, 2018
@alexcrichton
Copy link
Member

Thanks!

@paoloteti paoloteti deleted the arm-dsp branch July 20, 2018 18:30
@japaric
Copy link
Member

japaric commented Jul 23, 2018

I think these intrinsics are not available on ARMv7-M (e.g. thumbv7m-none-eabi); they are only available on ARMv7E-M (e.g. thumbv7em-none-eabi). I tested with qadd8 and I get an LLVM error on ARMv7-M:

#![no_std]
#![feature(stdsimd)]

use core::arch::arm;
use core::ptr;

#[no_mangle]
pub unsafe fn foo() {
    let x = ptr::read_volatile(0x2000_0000 as *const _);
    let y = ptr::read_volatile(0x2000_0100 as *const _);
    let z = arm::qadd8(x, y);
    ptr::write_volatile(0x2000_0200 as *mut _, z)
}
$ cargo build --target thumbv7m-none-eabi
LLVM ERROR: Cannot select: intrinsic %llvm.arm.qadd8

$ cargo build --target thumbv7em-none-eabi && echo OK
OK

I believe these intrinsics need to be behind a #[cfg(target_feature = "dsp")], but the "dsp" target feature needs to be whitelisted first.

@paoloteti
Copy link
Contributor Author

I agree, but I'm wondering why CI was not reporting this error on qadd8. I had the same error on CI on arm_sel that is now behind not mclass (waiting for +dsp), but not on qadd8.

@japaric
Copy link
Member

japaric commented Jul 23, 2018

Not sure. I'm looking at the rlib that comes in the rust-std component for thumbv7m-none-eabi I don't see the (machine code of the) qadd8 / arm_qadd function. It seems the function didn't get codegen -- this is usually the case for generic and #[inline(always)] functions -- instead it got stored as metadata in the rlib.

Executables are more reliable than rlibs for testing codegen -- provided that the compiler didn't opt away everything you are interested in.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 23, 2018

It seems the function didn't get codegen

This is correct, the reason is that they are #[inline].

Since we cannot compile tests, what we need is a separate crate (we can add it as a crate in the workspace), that tries to use these functions. An rlib with non-#[inline] wrappers over these intrinsics would be enough as long as these wrappers are exported (make them pub).

We should be doing this for all functions in coresimd, so maybe we should find an automated way of generating this crate, and trying to compile it. cc @alexcrichton ? We basically just need to generate a wrapper for every single function in coresimd after macro expansion has removed the functions disabled by cfg(target_feature = "...") for a given target.

@paoloteti
Copy link
Contributor Author

paoloteti commented Jul 23, 2018

@japaric may be +dsp is not enough here and in general for M3 devices (ARMv7-M).

+dsp enable/disable QADD/QSUB and few others (smlabb etc) where use of Q flag is mandatory.

In the C world +dsp set __ARM_FEATURE_DSP, instead qadd8 and similar are guarded by __ARM_FEATURE_SIMD32 that for what I known depends on -mcpu=... only.

I'm sure in GCC We have (may be it is the same on LLVM):

Cortex-M3

-mcpu=cortex-m3
#define __ARM_ARCH_7M__ 1

Cortex-M4

-mcpu=cortex-m4
#define __ARM_ARCH_7EM__ 1
#define __ARM_FEATURE_DSP 1
#define __ARM_FEATURE_SIMD32 1

Cortex-R4

-mcpu=cortex-r4
#define ___ARM_ARCH 7 
#define __ARM_FEATURE_DSP 1
#define __ARM_FEATURE_SIMD32 1

So I guess We need to know the 'E' in v7E.

BTW I'm in the Cortex-R world where We don't have this "problem". So correct me if I'm wrong.

@alexcrichton
Copy link
Member

@gnzlbg

We basically just need to generate a wrapper for every single function in coresimd

I think this is what assert_instr does, right? So long as all functions have an #[assert_instr] annotation we should be guaranteed to compile them during tests. (so long as the cfg's work out)

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 23, 2018

Yeah.

The problem with the "run-time" tests is that these targets (thumb*-...) are not supported by the test crate. But that means that we just have to get assert_instr to generate the "static tests" in some other way, e.g., if one passes --cfg STDSIMD_STATIC_TESTS.

@alexcrichton
Copy link
Member

Hm I'm not sure I quite understand, do you mean that we basically just don't run tests for the thumb targets? (and therefore we're not hitting the LLVM assertion?) If so, perhaps we could add the thumb targets to CI?

@japaric
Copy link
Member

japaric commented Jul 24, 2018

@alexcrichton

If so, perhaps we could add the thumb targets to CI?

We have the thumbs targets in CI but we only run cargo build for them.

The assert_instr macro only expands when running cargo test but we don't cargo test the thumb targets because they don't support the test crate or cargo test.

The expansion of the assert_instr macro also depends on crates like backtrace which themselves depend on std and the thumb targets don't have std support so those crates don't support the thumb targets.

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 24, 2018

@japaric I talked with @alexcrichton on IRC about this, and the easiest path forward would probably be to just have a macro that, when a --cfg flag is used, expands to a function that uses the intrinsic. For example, given:

#[inline]
#[target_feature(enable = "mclass")]
#[static_test]
pub unsafe fn foo(a: i32, b: i32) -> i32 {...}

the static_test proc macro could, for example, add:

#[cfg(stdsimd_static_test)]
#[inline(never)]
#[target_feature(enable = "mclass")]
pub unsafe fn __static_test_foo_forward(a: i32, b: i32) -> i32 {
    foo(a, b)
}
#[cfg(stdsimd_static_test)]
#[inline(never)]
#[target_feature(enable = "mclass")]
pub unsafe fn __static_test_foo_default() -> i32 {
    foo(Default::default(), Default::default())
}

Then doing a cargo build with --cfg stdsimd_static_test should be enough to catch these issues.

In the discussion with @alexcrichton on IRC we were talking about getting assert_instr do this. Taking a closer look, all code that uses assert_instr is actually guarded with #[cfg_attr(test, assert_instr(...)], so we would either need to change that to just #[assert_instr(...)] and handle the #[cfg(test)] internally (doing one thing, or the other, or both, depending on --cfg stdsimd_static_test), or just use a different macro.

Looking into assert_instr implementation, I think I would find it cleaner to have a different macro for this. Once we have #[static_test], we can add a second proc macro, e.g. #![static_test_all], that just applies #[static_test] to every pub fn item in a module, for example, so that we can easily add this to the x86 modules as well. This is something that we should always be testing anyways.

@alexcrichton
Copy link
Member

Ah to expand a bit on my thoughts, I'd actually prefer to reuse the existing #[assert_instr] attribute but basically "get it working" for thumb targets in terms of compiling a test function and actually codegen'ing the various intrinsics. I think we just need to do the legwork to get things like backtrace compiled out on thumb, right?

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 24, 2018

@alexcrichton currently backtrace requires std, so the first step would be to be able to use backtrace with core only. I actually needed this for another project of mine and it wasn't a trivial change: backtrace needs to know stuff about the platform its running on, but core doesn't provide this information...

We would also need the test crate to work with core, and I don't even know if this is possible: the test crate needs to print stuff to the screen, but you can't do that with core, and the thumb* targets don't even have libc...

@alexcrichton
Copy link
Member

Oh I mean to just flat out exclude backtrace from the build (there's no hope for it on thumb)

@alexcrichton
Copy link
Member

(and similarly for other crates, I think there's a test replacement?)

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 24, 2018

And move all the test stuff inside the assert_instr macro? That is, instead of #[cfg_attr(test, assert_instr(...))], just move all invokations to #[assert_instr(...)] and handle the test stuff internally ?

@alexcrichton
Copy link
Member

Oh no I still think we should only compile all this in during testing, we can't touch the main build much because it has to be include-able as a submodule into rust-lang/rust

@gnzlbg
Copy link
Contributor

gnzlbg commented Jul 25, 2018

I see, then we need to make sure that all the dev-dependencies that we have compile without std on these targets (e.g. it would be ok if they just "do nothing" in that case). And then we have to get cargo test to somehow work on this targets as well: e.g. compiling with --cfg=test but not actually running any tests or something like that.

@alexcrichton
Copy link
Member

Indeed yeah! That's what I was thinking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants