ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. #59324

mdickinson · 2012-06-21T09:13:35Z

BPO	15119
Nosy	@mdickinson, @meadori
Files	ctypes_mixed_bitfields.patch ctypes_bitfields.py: Random tests comparing ctypes bitfield layout with gcc4.2 / x64 bitfield layout

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2012-06-21.09:13:35.059>
labels = ['ctypes', 'type-bug']
title = "ctypes mixed-types bitfield layout nonsensical; doesn't match compiler."
updated_at = <Date 2014-07-16.19:30:20.597>
user = 'https://github.com/mdickinson'

bugs.python.org fields:

activity = <Date 2014-07-16.19:30:20.597>
actor = 'wbu'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['ctypes']
creation = <Date 2012-06-21.09:13:35.059>
creator = 'mark.dickinson'
dependencies = []
files = ['26072', '26084']
hgrepos = []
issue_num = 15119
keywords = ['patch']
message_count = 9.0
messages = ['163315', '163319', '163353', '163354', '163417', '163418', '163419', '223251', '223256']
nosy_count = 3.0
nosy_names = ['mark.dickinson', 'meador.inge', 'wbu']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue15119'
versions = ['Python 2.7', 'Python 3.3']

mdickinson · 2012-06-21T09:13:34Z

It looks as though there's a bug in the ctypes bitfield layout algorithm. After:

>>> from ctypes import Structure, c_int, c_short
>>> class BITS(Structure):
...     _fields_ = [("A", c_int, 17), ("M", c_short, 1)]
...

I get:

>>> BITS.M
<Field type=c_short, ofs=2:17, bits=1>

which doesn't make a lot of sense (17th bit of a short?) This causes a negative shift operation when trying to access the .M field of an instance of this structure (see bpo-9530 and in particular msg163303).

On this machine (OS X 10.6, 64-bit build of Python using the system gcc (4.2) with no special compiler flags), the corresponding struct in a simple C test program has size 4:

    #include <stdio.h>

    struct {
      int A : 17;
      short B: 1;
    } flags;

    int main(void) {
      printf("sizeof flags is: %ld\n", sizeof(flags));
      return 0;
    }

So it looks like everything gets packed into that first int. At a guess, BITS.M should therefore look like <Field type=c_int, ofs=0:17, bits=1> instead.

System info:

Python 3.3.0a4+ (default:2035c5ad4239+, Jun 21 2012, 08:30:36)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

mdickinson · 2012-06-21T10:17:35Z

At a guess, BITS.M should therefore look like <Field type=c_int,
ofs=0:17, bits=1> instead.

Refined guess: it should be <Field type=c_short, ofs=2:1, bits=1>.

Tests for this issue should also cover cases like:

    _fields_ = [("A", c_int, 13), ("M", c_short, 5)]

where M should end up being described as <Field type=c_short, ofs=2:0, bits=5>.

mdickinson · 2012-06-21T17:30:03Z

There are two separate issues here. The first is that the layout that ctypes chooses for a struct of bitfields fails basic sanity checks, like having each bitfield actually fit in the corresponding type. As a result, the C-level bitshifting code used to get bitfields ends up invoking undefined behaviour.

A secondary problem is that the ctypes layout doesn't match what the compiler does, at least for the system supplied gcc (4.2) on OS X 10.6.

The attached patch fixes the first issue, but not the second.

meadori · 2012-06-21T17:40:32Z

Thanks for digging into this Mark. I will have a look too later in the day.

meadori · 2012-06-22T14:46:16Z

> At a guess, BITS.M should therefore look like <Field type=c_int,
> ofs=0:17, bits=1> instead.

Refined guess: it should be <Field type=c_short, ofs=2:1, bits=1>.

This refined guess seems reasonable. Although, bitfield allocation order for GCC is dependent on the target ABI. What you have above is at least consistent with the System V i386 [1] and x86-64 [2] psABIs. Not sure about others (other targets and MSVC++ related ones).

I tested the original test case plus the cases listed in the i386 psABI, all which seem to work. I did notice that this doesn't seem to be right for big-endian machines:

>>> from ctypes import *
>>> class S(BigEndianStructure):
...     _fields_ = [("A", c_int, 17), ("B", c_short, 1)]
... 
>>> class T(LittleEndianStructure):
...     _fields_ = [("A", c_int, 17), ("B", c_short, 1)]
... 
>>> s = S()
>>> s.B = 1
>>> s.B
-1
>>> t = T()
>>> t.B = 1
>>> t.B
0

The current implementation got the expected answer of -1 for 't.B' (although that is actually incorrect anyway because bitfields should never be treated as signed).

So some big-endian tests and some tests that check the values stored in the fields will be useful.

Finally, I think proposed allocation seems correct, but I must admit I am not clever enough to follow why the following part works :-)

+ /* Adjust current bit offset if necessary so that the next field
+ doesn't straddle a multiple of 8*dict->size. */
+ if (*pbitofs && (
+ (*pbitofs + bitsize - 1) % (8*dict->size) !=
+ bitsize + (*pbitofs - 1) % (8*dict->size)))
+ *pbitofs += (8*dict->size) - 1 - (*pbitofs - 1) % (8*dict->size);

[1] http://www.uclibc.org/docs/psABI-i386.pdf
[2] http://www.x86-64.org/documentation/abi.pdf

mdickinson · 2012-06-22T14:55:20Z

Finally, I think proposed allocation seems correct, but I must admit I
am not clever enough to follow why the following part works :-)

Nor am I, any more, though it made sense when I wrote it. I'll see if I can make that a bit more readable.

I see two goals here: (1) make the allocation sane and self-consistent, and also ideally document the algorithm ctypes uses (I know there's another issue already open for this), and (2) make the allocation match common compilers. (2) may not be easy / possible...

I did some random testing on my machine (x64, OS X, gcc 4.2), which seems to show that the bitfield allocation algorithm for gcc works roughly like this:

    def simulated_layout(flags):
        bitpos = 0
        for ctype, width in flags:
            if width is None:
                # Plain old integer field (not a bitfield)
                width = 8 * sizeof(ctype)
            space = -bitpos % (8 * sizeof(ctype))
            if width > space:
                bitpos += space
            offset, start = divmod(bitpos, 8 * sizeof(ctype))
            yield offset * sizeof(ctype), start, width
            bitpos += width

At least, my simple and limited random tests have yet to discover a case where this differs from what gcc actually does on my machine, while they're pretty quick to find differences between what gcc does and what ctypes does. I've attached the script in case it's of interest (please don't judge too harshly---it was written quickly and the style leaves something to be desired). In particular, I didn't include signed integers in the tests; sounds like that's a potential complicating factor.

mdickinson · 2012-06-22T14:59:22Z

this doesn't seem to be right for big-endian machines

Right; I didn't pay too much attention to the big-endian case; definitely there should be lots of tests there, so that at least the buildbots have a chance of picking up problems. (Do we currently *have* any big-endian buildbots? I see one Sparc and one PPC machine, but it looks like they're both currently offline.)

wbu · 2014-07-16T18:52:42Z

I just run into this issue, so i'll bump it with another test case:

import ctypes

class Struct(ctypes.Structure):
    _fields_ = [
        ("uint8_0", ctypes.c_uint8, 8),
        ("uint8_1", ctypes.c_uint8, 8),
        ("uint16_0", ctypes.c_uint16, 1),
        ("uint16_1", ctypes.c_uint16, 15),
    ]

for f in Struct._fields_:
    print f[0], getattr(Struct, f[0])

python bitfield.py
uint8_0 <Field type=c_ubyte, ofs=0:0, bits=8>
uint8_1 <Field type=c_ubyte, ofs=1:0, bits=8>
uint16_0 <Field type=c_ushort, ofs=1:8, bits=1>
uint16_1 <Field type=c_ushort, ofs=4:0, bits=15>

Originally tested with Python 2.7.3, but also confirmed with later versions.

Is there any workaround by specifying ofs and bits manually?

wbu · 2014-07-16T19:30:21Z

Answering my own question, here is a workaround, that also produces reasonable results for the original test case. Basically just inserting an empty struct:

import ctypes

class Empty(ctypes.Structure):
    _fields_ = []

class Struct(ctypes.Structure):
    _fields_ = [
        ("uint8_0", ctypes.c_uint8, 8),
        ("uint8_1", ctypes.c_uint8, 8),
        ("_ignore", Empty),
        ("uint16_0", ctypes.c_uint16, 1),
        ("uint16_1", ctypes.c_uint16, 15),
    ]

for f in Struct._fields_:
    print f[0], getattr(Struct, f[0])

python bitfield.py
uint8_0 <Field type=c_ubyte, ofs=0:0, bits=8>
uint8_1 <Field type=c_ubyte, ofs=1:0, bits=8>
_ignore <Field type=Empty, ofs=2, size=0>
uint16_0 <Field type=c_ushort, ofs=2:0, bits=1>
uint16_1 <Field type=c_ushort, ofs=2:1, bits=15>

matthiasgoergens · 2022-10-09T05:47:21Z

Thanks for the test cases! (I'm writing my random test cases with Hypothesis, btw. That one has shrinking, too, to give you the smallest possible counter example.)

I had to fix a few minor problems with your code. (Eg the random C code you produce doesn't zero memory, etc.)

My PR at #97702 passes your tests!

mdickinson added topic-ctypes type-bug An unexpected behavior, bug, or error labels Jun 21, 2012

mdickinson changed the title ~~Bug in ctypes bitfield layout?~~ ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. Jun 21, 2012

ezio-melotti transferred this issue from another repository Apr 10, 2022

serhiy-storchaka added this to Locale issues 🗺 May 21, 2022

serhiy-storchaka removed this from Locale issues 🗺 May 21, 2022

mdickinson mentioned this issue Sep 25, 2022

gh-96821: Fix undefined behaviour in _ctypes/cfield.c #96925

Open

mdickinson mentioned this issue Oct 8, 2022

ctypes: bit field data does not survive round trip #97588

Closed

matthiasgoergens mentioned this issue Oct 9, 2022

gh-97588: Fix ctypes structs #97702

Merged

matthiasgoergens mentioned this issue Mar 17, 2023

Backport 'Fix ctypes structs' (97702) to 3.11 #102774

Closed

DakshKK mentioned this issue Mar 20, 2023

ctypes.Structure does not set value of fields #102844

Closed

encukou added a commit to encukou/cpython that referenced this issue Sep 20, 2024

Add examples from pythongh-59324

2298704

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. #59324

ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. #59324

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

meadori commented Jun 21, 2012

meadori commented Jun 22, 2012

mdickinson commented Jun 22, 2012

mdickinson commented Jun 22, 2012

wbu mannequin commented Jul 16, 2014

wbu mannequin commented Jul 16, 2014

matthiasgoergens commented Oct 9, 2022 •

edited

Loading

ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. #59324

ctypes mixed-types bitfield layout nonsensical; doesn't match compiler. #59324

Comments

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

mdickinson commented Jun 21, 2012

meadori commented Jun 21, 2012

meadori commented Jun 22, 2012

mdickinson commented Jun 22, 2012

mdickinson commented Jun 22, 2012

wbu mannequin commented Jul 16, 2014

wbu mannequin commented Jul 16, 2014

matthiasgoergens commented Oct 9, 2022 • edited Loading

matthiasgoergens commented Oct 9, 2022 •

edited

Loading