-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[Clang] Implement the 'counted_by' attribute #76348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } In this example, an update to 'p->count' maintains the relationship requirement: void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; }
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Bill Wendling (bwendling) ChangesThe 'counted_by' attribute is used on flexible array members. The struct bar; This example specifies that the flexible array member 'array' has the struct bar; This establishes a relationship between 'array' and 'count'; In the following, the allocated array erroneously has fewer elements struct foo *p; void foo_alloc(size_t count) { The next example updates 'p->count', breaking the relationship void use_foo(int index, int val) { In this example, an update to 'p->count' maintains the relationship void use_foo(int index, int val) { Patch is 183.31 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76348.diff 20 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ee211c16a48ac8..cfd89e32c3d0b9 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -202,6 +202,11 @@ C Language Changes
- Enums will now be represented in TBAA metadata using their actual underlying
integer type. Previously they were treated as chars, which meant they could
alias with all other types.
+- Clang now supports the C-only attribute ``counted_by``. When applied to a
+ struct's flexible array member, it points to the struct field that holds the
+ number of elements in the flexible array member. This information can improve
+ the results of the array bound sanitizer and the
+ ``__builtin_dynamic_object_size`` builtin.
C23 Feature Support
^^^^^^^^^^^^^^^^^^^
diff --git a/clang/include/clang/AST/DeclBase.h b/clang/include/clang/AST/DeclBase.h
index 10dcbdb262d84e..5b1038582bc674 100644
--- a/clang/include/clang/AST/DeclBase.h
+++ b/clang/include/clang/AST/DeclBase.h
@@ -19,6 +19,7 @@
#include "clang/AST/SelectorLocationsKind.h"
#include "clang/Basic/IdentifierTable.h"
#include "clang/Basic/LLVM.h"
+#include "clang/Basic/LangOptions.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/Specifiers.h"
#include "llvm/ADT/ArrayRef.h"
@@ -488,6 +489,15 @@ class alignas(8) Decl {
// Return true if this is a FileContext Decl.
bool isFileContextDecl() const;
+ /// Whether it resembles a flexible array member. This is a static member
+ /// because we want to be able to call it with a nullptr. That allows us to
+ /// perform non-Decl specific checks based on the object's type and strict
+ /// flex array level.
+ static bool isFlexibleArrayMemberLike(
+ ASTContext &Context, const Decl *D, QualType Ty,
+ LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel,
+ bool IgnoreTemplateOrMacroSubstitution);
+
ASTContext &getASTContext() const LLVM_READONLY;
/// Helper to get the language options from the ASTContext.
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index db17211747b17d..f8f55ef513d07e 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -4349,3 +4349,21 @@ def CodeAlign: StmtAttr {
static constexpr int MaximumAlignment = 4096;
}];
}
+
+def CountedBy : InheritableAttr {
+ let Spellings = [Clang<"counted_by">];
+ let Subjects = SubjectList<[Field]>;
+ let Args = [IdentifierArgument<"CountedByField">];
+ let Documentation = [CountedByDocs];
+ let LangOpts = [COnly];
+ // FIXME: This is ugly. Let using a DeclArgument would be nice, but a Decl
+ // isn't yet available due to the fact that we're still parsing the
+ // structure. Maybe that code could be changed sometime in the future.
+ code AdditionalMembers = [{
+ private:
+ SourceRange CountedByFieldLoc;
+ public:
+ SourceRange getCountedByFieldLoc() const { return CountedByFieldLoc; }
+ void setCountedByFieldLoc(SourceRange Loc) { CountedByFieldLoc = Loc; }
+ }];
+}
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 98a7ecc7fd7df3..a795adc770f061 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -7722,3 +7722,81 @@ Both coroutines and coroutine wrappers are part of this analysis.
.. _`CRT`: https://clang.llvm.org/docs/AttributeReference.html#coro-return-type
}];
}
+
+def CountedByDocs : Documentation {
+ let Category = DocCatField;
+ let Content = [{
+Clang supports the ``counted_by`` attribute on the flexible array member of a
+structure in C. The argument for the attribute is the name of a field member
+holding the count of elements in the flexible array. This information can be
+used to improve the results of the array bound sanitizer and the
+``__builtin_dynamic_object_size`` builtin. The ``count`` field member must be
+within the same non-anonymous, enclosing struct as the flexible array member.
+
+This example specifies that the flexible array member ``array`` has the number
+of elements allocated for it in ``count``:
+
+.. code-block:: c
+
+ struct bar;
+
+ struct foo {
+ size_t count;
+ char other;
+ struct bar *array[] __attribute__((counted_by(count)));
+ };
+
+This establishes a relationship between ``array`` and ``count``. Specifically,
+``array`` must have at least ``count`` number of elements available. It's the
+user's responsibility to ensure that this relationship is maintained through
+changes to the structure.
+
+In the following example, the allocated array erroneously has fewer elements
+than what's specified by ``p->count``. This would result in an out-of-bounds
+access not being detected.
+
+.. code-block:: c
+
+ #define SIZE_INCR 42
+
+ struct foo *p;
+
+ void foo_alloc(size_t count) {
+ p = malloc(MAX(sizeof(struct foo),
+ offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
+ p->count = count + SIZE_INCR;
+ }
+
+The next example updates ``p->count``, but breaks the relationship requirement
+that ``p->array`` must have at least ``p->count`` number of elements available:
+
+.. code-block:: c
+
+ #define SIZE_INCR 42
+
+ struct foo *p;
+
+ void foo_alloc(size_t count) {
+ p = malloc(MAX(sizeof(struct foo),
+ offsetof(struct foo, array[0]) + count * sizeof(struct bar *)));
+ p->count = count;
+ }
+
+ void use_foo(int index, int val) {
+ p->count += SIZE_INCR + 1; /* 'count' is now larger than the number of elements of 'array'. */
+ p->array[index] = val; /* The sanitizer can't properly check this access. */
+ }
+
+In this example, an update to ``p->count`` maintains the relationship
+requirement:
+
+.. code-block:: c
+
+ void use_foo(int index, int val) {
+ if (p->count == 0)
+ return;
+ --p->count;
+ p->array[index] = val;
+ }
+ }];
+}
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index aebb7d9b945c33..26b7416b6e5c09 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6441,6 +6441,19 @@ def warn_superclass_variable_sized_type_not_at_end : Warning<
"field %0 can overwrite instance variable %1 with variable sized type %2"
" in superclass %3">, InGroup<ObjCFlexibleArray>;
+def err_flexible_array_count_not_in_same_struct : Error<
+ "'counted_by' field %0 isn't within the same struct as the flexible array">;
+def err_counted_by_attr_not_on_flexible_array_member : Error<
+ "'counted_by' only applies to C99 flexible array members">;
+def err_counted_by_attr_refers_to_flexible_array : Error<
+ "'counted_by' cannot refer to the flexible array %0">;
+def err_counted_by_must_be_in_structure : Error<
+ "field %0 in 'counted_by' not inside structure">;
+def err_flexible_array_counted_by_attr_field_not_integer : Error<
+ "field %0 in 'counted_by' must be a non-boolean integer type">;
+def note_flexible_array_counted_by_attr_field : Note<
+ "field %0 declared here">;
+
let CategoryName = "ARC Semantic Issue" in {
// ARC-mode diagnostics.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 5e3b57ea33220b..40b8133202e23e 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -4799,6 +4799,8 @@ class Sema final {
bool CheckAlwaysInlineAttr(const Stmt *OrigSt, const Stmt *CurSt,
const AttributeCommonInfo &A);
+ bool CheckCountedByAttr(Scope *Scope, const FieldDecl *FD);
+
/// Adjust the calling convention of a method to be the ABI default if it
/// wasn't specified explicitly. This handles method types formed from
/// function type typedefs and typename template arguments.
@@ -5642,6 +5644,7 @@ class Sema final {
CorrectionCandidateCallback &CCC,
TemplateArgumentListInfo *ExplicitTemplateArgs = nullptr,
ArrayRef<Expr *> Args = std::nullopt,
+ DeclContext *LookupCtx = nullptr,
TypoExpr **Out = nullptr);
DeclResult LookupIvarInObjCMethod(LookupResult &Lookup, Scope *S,
diff --git a/clang/include/clang/Sema/TypoCorrection.h b/clang/include/clang/Sema/TypoCorrection.h
index e0f8d152dbe554..09de164297e7ba 100644
--- a/clang/include/clang/Sema/TypoCorrection.h
+++ b/clang/include/clang/Sema/TypoCorrection.h
@@ -282,7 +282,7 @@ class CorrectionCandidateCallback {
public:
static const unsigned InvalidDistance = TypoCorrection::InvalidDistance;
- explicit CorrectionCandidateCallback(IdentifierInfo *Typo = nullptr,
+ explicit CorrectionCandidateCallback(const IdentifierInfo *Typo = nullptr,
NestedNameSpecifier *TypoNNS = nullptr)
: Typo(Typo), TypoNNS(TypoNNS) {}
@@ -319,7 +319,7 @@ class CorrectionCandidateCallback {
/// this method.
virtual std::unique_ptr<CorrectionCandidateCallback> clone() = 0;
- void setTypoName(IdentifierInfo *II) { Typo = II; }
+ void setTypoName(const IdentifierInfo *II) { Typo = II; }
void setTypoNNS(NestedNameSpecifier *NNS) { TypoNNS = NNS; }
// Flags for context-dependent keywords. WantFunctionLikeCasts is only
@@ -345,13 +345,13 @@ class CorrectionCandidateCallback {
candidate.getCorrectionSpecifier() == TypoNNS;
}
- IdentifierInfo *Typo;
+ const IdentifierInfo *Typo;
NestedNameSpecifier *TypoNNS;
};
class DefaultFilterCCC final : public CorrectionCandidateCallback {
public:
- explicit DefaultFilterCCC(IdentifierInfo *Typo = nullptr,
+ explicit DefaultFilterCCC(const IdentifierInfo *Typo = nullptr,
NestedNameSpecifier *TypoNNS = nullptr)
: CorrectionCandidateCallback(Typo, TypoNNS) {}
@@ -365,6 +365,10 @@ class DefaultFilterCCC final : public CorrectionCandidateCallback {
template <class C>
class DeclFilterCCC final : public CorrectionCandidateCallback {
public:
+ explicit DeclFilterCCC(const IdentifierInfo *Typo = nullptr,
+ NestedNameSpecifier *TypoNNS = nullptr)
+ : CorrectionCandidateCallback(Typo, TypoNNS) {}
+
bool ValidateCandidate(const TypoCorrection &candidate) override {
return candidate.getCorrectionDeclAs<C>();
}
diff --git a/clang/lib/AST/ASTImporter.cpp b/clang/lib/AST/ASTImporter.cpp
index 949310856562cd..571572fe65aecf 100644
--- a/clang/lib/AST/ASTImporter.cpp
+++ b/clang/lib/AST/ASTImporter.cpp
@@ -9021,6 +9021,10 @@ class AttrImporter {
public:
AttrImporter(ASTImporter &I) : Importer(I), NImporter(I) {}
+ // Useful for accessing the imported attribute.
+ template <typename T> T *castAttrAs() { return cast<T>(ToAttr); }
+ template <typename T> const T *castAttrAs() const { return cast<T>(ToAttr); }
+
// Create an "importer" for an attribute parameter.
// Result of the 'value()' of that object is to be passed to the function
// 'importAttr', in the order that is expected by the attribute class.
@@ -9234,6 +9238,15 @@ Expected<Attr *> ASTImporter::Import(const Attr *FromAttr) {
From->args_size());
break;
}
+ case attr::CountedBy: {
+ AI.cloneAttr(FromAttr);
+ const auto *CBA = cast<CountedByAttr>(FromAttr);
+ Expected<SourceRange> SR = Import(CBA->getCountedByFieldLoc()).get();
+ if (!SR)
+ return SR.takeError();
+ AI.castAttrAs<CountedByAttr>()->setCountedByFieldLoc(SR.get());
+ break;
+ }
default: {
// The default branch works for attributes that have no arguments to import.
diff --git a/clang/lib/AST/DeclBase.cpp b/clang/lib/AST/DeclBase.cpp
index 5e03f0223d311c..e4d7169752bc85 100644
--- a/clang/lib/AST/DeclBase.cpp
+++ b/clang/lib/AST/DeclBase.cpp
@@ -29,7 +29,6 @@
#include "clang/AST/Type.h"
#include "clang/Basic/IdentifierTable.h"
#include "clang/Basic/LLVM.h"
-#include "clang/Basic/LangOptions.h"
#include "clang/Basic/Module.h"
#include "clang/Basic/ObjCRuntime.h"
#include "clang/Basic/PartialDiagnostic.h"
@@ -411,6 +410,79 @@ bool Decl::isFileContextDecl() const {
return DC && DC->isFileContext();
}
+bool Decl::isFlexibleArrayMemberLike(
+ ASTContext &Ctx, const Decl *D, QualType Ty,
+ LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel,
+ bool IgnoreTemplateOrMacroSubstitution) {
+ // For compatibility with existing code, we treat arrays of length 0 or
+ // 1 as flexible array members.
+ const auto *CAT = Ctx.getAsConstantArrayType(Ty);
+ if (CAT) {
+ using FAMKind = LangOptions::StrictFlexArraysLevelKind;
+
+ llvm::APInt Size = CAT->getSize();
+ if (StrictFlexArraysLevel == FAMKind::IncompleteOnly)
+ return false;
+
+ // GCC extension, only allowed to represent a FAM.
+ if (Size.isZero())
+ return true;
+
+ if (StrictFlexArraysLevel == FAMKind::ZeroOrIncomplete && Size.uge(1))
+ return false;
+
+ if (StrictFlexArraysLevel == FAMKind::OneZeroOrIncomplete && Size.uge(2))
+ return false;
+ } else if (!Ctx.getAsIncompleteArrayType(Ty)) {
+ return false;
+ }
+
+ if (const auto *OID = dyn_cast_if_present<ObjCIvarDecl>(D))
+ return OID->getNextIvar() == nullptr;
+
+ const auto *FD = dyn_cast_if_present<FieldDecl>(D);
+ if (!FD)
+ return false;
+
+ if (CAT) {
+ // GCC treats an array memeber of a union as an FAM if the size is one or
+ // zero.
+ llvm::APInt Size = CAT->getSize();
+ if (FD->getParent()->isUnion() && (Size.isZero() || Size.isOne()))
+ return true;
+ }
+
+ // Don't consider sizes resulting from macro expansions or template argument
+ // substitution to form C89 tail-padded arrays.
+ if (IgnoreTemplateOrMacroSubstitution) {
+ TypeSourceInfo *TInfo = FD->getTypeSourceInfo();
+ while (TInfo) {
+ TypeLoc TL = TInfo->getTypeLoc();
+
+ // Look through typedefs.
+ if (TypedefTypeLoc TTL = TL.getAsAdjusted<TypedefTypeLoc>()) {
+ const TypedefNameDecl *TDL = TTL.getTypedefNameDecl();
+ TInfo = TDL->getTypeSourceInfo();
+ continue;
+ }
+
+ if (auto CTL = TL.getAs<ConstantArrayTypeLoc>()) {
+ if (const Expr *SizeExpr =
+ dyn_cast_if_present<IntegerLiteral>(CTL.getSizeExpr());
+ !SizeExpr || SizeExpr->getExprLoc().isMacroID())
+ return false;
+ }
+
+ break;
+ }
+ }
+
+ // Test that the field is the last in the structure.
+ RecordDecl::field_iterator FI(
+ DeclContext::decl_iterator(const_cast<FieldDecl *>(FD)));
+ return ++FI == FD->getParent()->field_end();
+}
+
TranslationUnitDecl *Decl::getTranslationUnitDecl() {
if (auto *TUD = dyn_cast<TranslationUnitDecl>(this))
return TUD;
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index a90f92d07f86d2..b125fc676da841 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -205,85 +205,22 @@ bool Expr::isKnownToHaveBooleanValue(bool Semantic) const {
}
bool Expr::isFlexibleArrayMemberLike(
- ASTContext &Context,
+ ASTContext &Ctx,
LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel,
bool IgnoreTemplateOrMacroSubstitution) const {
-
- // For compatibility with existing code, we treat arrays of length 0 or
- // 1 as flexible array members.
- const auto *CAT = Context.getAsConstantArrayType(getType());
- if (CAT) {
- llvm::APInt Size = CAT->getSize();
-
- using FAMKind = LangOptions::StrictFlexArraysLevelKind;
-
- if (StrictFlexArraysLevel == FAMKind::IncompleteOnly)
- return false;
-
- // GCC extension, only allowed to represent a FAM.
- if (Size == 0)
- return true;
-
- if (StrictFlexArraysLevel == FAMKind::ZeroOrIncomplete && Size.uge(1))
- return false;
-
- if (StrictFlexArraysLevel == FAMKind::OneZeroOrIncomplete && Size.uge(2))
- return false;
- } else if (!Context.getAsIncompleteArrayType(getType()))
- return false;
-
const Expr *E = IgnoreParens();
+ const Decl *D = nullptr;
- const NamedDecl *ND = nullptr;
- if (const auto *DRE = dyn_cast<DeclRefExpr>(E))
- ND = DRE->getDecl();
- else if (const auto *ME = dyn_cast<MemberExpr>(E))
- ND = ME->getMemberDecl();
+ if (const auto *ME = dyn_cast<MemberExpr>(E))
+ D = ME->getMemberDecl();
+ else if (const auto *DRE = dyn_cast<DeclRefExpr>(E))
+ D = DRE->getDecl();
else if (const auto *IRE = dyn_cast<ObjCIvarRefExpr>(E))
- return IRE->getDecl()->getNextIvar() == nullptr;
-
- if (!ND)
- return false;
+ D = IRE->getDecl();
- // A flexible array member must be the last member in the class.
- // FIXME: If the base type of the member expr is not FD->getParent(),
- // this should not be treated as a flexible array member access.
- if (const auto *FD = dyn_cast<FieldDecl>(ND)) {
- // GCC treats an array memeber of a union as an FAM if the size is one or
- // zero.
- if (CAT) {
- llvm::APInt Size = CAT->getSize();
- if (FD->getParent()->isUnion() && (Size.isZero() || Size.isOne()))
- return true;
- }
-
- // Don't consider sizes resulting from macro expansions or template argument
- // substitution to form C89 tail-padded arrays.
- if (IgnoreTemplateOrMacroSubstitution) {
- TypeSourceInfo *TInfo = FD->getTypeSourceInfo();
- while (TInfo) {
- TypeLoc TL = TInfo->getTypeLoc();
- // Look through typedefs.
- if (TypedefTypeLoc TTL = TL.getAsAdjusted<TypedefTypeLoc>()) {
- const TypedefNameDecl *TDL = TTL.getTypedefNameDecl();
- TInfo = TDL->getTypeSourceInfo();
- continue;
- }
- if (ConstantArrayTypeLoc CTL = TL.getAs<ConstantArrayTypeLoc>()) {
- const Expr *SizeExpr = dyn_cast<IntegerLiteral>(CTL.getSizeExpr());
- if (!SizeExpr || SizeExpr->getExprLoc().isMacroID())
- return false;
- }
- break;
- }
- }
-
- RecordDecl::field_iterator FI(
- DeclContext::decl_iterator(const_cast<FieldDecl *>(FD)));
- return ++FI == FD->getParent()->field_end();
- }
-
- return false;
+ return Decl::isFlexibleArrayMemberLike(Ctx, D, E->getType(),
+ StrictFlexArraysLevel,
+ IgnoreTemplateOrMacroSubstitution);
}
const ValueDecl *
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 5081062da2862e..f43c7a0b53d953 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -25,6 +25,7 @@
#include "clang/AST/Attr.h"
#include "clang/AST/Decl.h"
#include "clang/AST/OSLog.h"
+#include "clang/AST/OperationKinds.h"
#include "clang/Basic/TargetBuiltins.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/TargetOptions.h"
@@ -818,6 +819,189 @@ CodeGenFunction::evaluateOrEmitBuiltinObjectSize(const Expr *E, unsigned Type,
return ConstantInt::get(ResType, ObjectSize, /*isSigned=*/true);
}
+const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField(
+ ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) {
+ const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+ getLangOpts().getStrictFlexArraysLevel();
+ unsigned FieldNo = 0;
+
+ for (const Decl *D : RD->decls()) {
+ if (const auto *Field = dyn_cast<FieldDecl>(D);
+ Field && Decl::isFlexibleArrayMemberLike(
+ Ctx, Field, Field->getType(), StrictFlexArraysLevel,
+ /*IgnoreTemplateOrMacroSubstitution=*/true)) {
+ const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+ Offset += Layout.getFieldOffset(FieldNo);
+ return Field;
+ }
+
+ if (const auto *Record = dyn_cast<RecordDecl>(D))
+ if (const FieldDecl *Field =
+ FindFlexibleArrayMemberField(Ctx, Record, Offset)) {
+ const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD);
+ Offset += Layout.getFieldOffset(FieldNo);
+ return Field;
+ }
+
+ if (isa<FieldDecl>(D))
+ ++FieldNo;
+ }
+
+ return nullptr;
+}
+
+llvm::Value *
+CodeGenFunction::emitFlexibleArrayMemberSize(const Expr *E, unsigned Type,
+ llvm::IntegerType *ResType) {
+ // The code generated here calculates the size of a struct with a flexible
+ // array member that uses the counted_by attribute. There are two instances
+ // we handle:
+ //
+ // struct s {
+ // unsigned long flags;
+ // int count;
+ // int array...
[truncated]
|
Some (hopefully) helpful comments on this admittedly very large patch. General Comment: I pulled back on allowing the
This should align with @rapidsna's bounds checking work and get rid of the "different behaviors for a pointer and non-pointer" that we had in the previous PR (#73730 (comment)). SEMA: The SEMA code is a bit different from the previous PR. Most notable is that it's not using the SEMA "lookup" utility the same way. There's no clear way to prevent lookup from climbing from the current scope up through the top-level scope. That leads to false negatives, duplicate warnings, and other horrors. Also, the All of these combined to make the use of the "lookup" utility very frustrating. There are probably many improvements that could be made to the lookup, but I'd prefer to do those in a follow-up patch. CodeGen: One observation: the There are two ways that the
So that's basically it. I have a program on my machine that runs most of the examples in the Thank you all for taking the time to review this! Share and enjoy! |
Pinging for after-holidays visibility. :-) |
clang/lib/CodeGen/CGBuiltin.cpp
Outdated
const FieldDecl *CodeGenFunction::FindFlexibleArrayMemberField( | ||
ASTContext &Ctx, const RecordDecl *RD, uint64_t &Offset) { | ||
const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel = | ||
getLangOpts().getStrictFlexArraysLevel(); | ||
unsigned FieldNo = 0; | ||
|
||
for (const Decl *D : RD->decls()) { | ||
if (const auto *Field = dyn_cast<FieldDecl>(D); | ||
Field && Decl::isFlexibleArrayMemberLike( | ||
Ctx, Field, Field->getType(), StrictFlexArraysLevel, | ||
/*IgnoreTemplateOrMacroSubstitution=*/true)) { | ||
const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD); | ||
Offset += Layout.getFieldOffset(FieldNo); | ||
return Field; | ||
} | ||
|
||
if (const auto *Record = dyn_cast<RecordDecl>(D)) | ||
if (const FieldDecl *Field = | ||
FindFlexibleArrayMemberField(Ctx, Record, Offset)) { | ||
const ASTRecordLayout &Layout = Ctx.getASTRecordLayout(RD); | ||
Offset += Layout.getFieldOffset(FieldNo); | ||
return Field; | ||
} | ||
|
||
if (isa<FieldDecl>(D)) | ||
++FieldNo; | ||
} | ||
|
||
return nullptr; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FindFieldInTopLevelOrAnonymousStruct
looks somewhat similar. Any chance to reuse code more between the two?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, and they're just differently enough to not be too compatible. The best(?) we could do would be to have a generic "deep" iterator through all of the fields in a struct. (As it is, the iterators are only for the surface level.)
In fact, I think there should be methods added to ASTRecordLayout
/ CGRecordLayout
to iterate through more than just the first-level decls. I'd like to do that as a follow-up commit though.
if (Value *V = emitFlexibleArrayMemberSize(E, Type, ResType)) | ||
return V; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should emitFlexibleArrayMemberSize
be made infallible (via asserts rather than early returns of nullptr
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over the weekend, I thought about emitting a -1 / 0
in the cases where we weren't able to generate the BDOS calculation. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments. Looking good overall but LocalDeclMap
doesn't seem to serve the purpose for this PR. And I'm not sure the code block using it would be actually necessary. Please see my inlined review.
…can rely upon assumptions.
… Mark the result as 'signed' after subtracting from it.
Possibly due to bug #72032 , I can get this tree to crash using the latest Specifically:
|
@kees Should be fixed now. |
…le the types ourselves throughout the function.
Thanks! The update fixes the anon struct issue I hit. I've found one more issue, though this appears to be a miscalculation with a pathological
This prints a wrapped calculation instead of the expected "0". |
@kees Shouldn't |
Well, maybe it's not wrapped, but it should absolutely return 0, not -1. The return value of -1 means "I don't know how large this is". In this case we do know (there is an associated |
@kees Oh, I see. I did not know such the convention but it makes sense. Is it documented somewhere? |
This is new territory (having a multiplier for finding size that may be negative), so there's nothing to document it beyond FORTIFY users needing to maintain safe checks. The only safe size to return for "impossible size" is 0 in this case, otherwise a confused state (negative Before |
…e value is unknown. In the negative bounds case that isn't the case.
@kees, the issue should be fixed with the newest push. |
This reverts commit fefdef8. Breaks check-clang, see #76348 (comment) Also revert follow-on "[Clang] Update 'counted_by' documentation" This reverts commit 4a3fb9c.
Reverted in 2dce772 for now. |
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' */ }; struct { /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```
Resubmitted with fix. Sorry about the failures:
|
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' */ }; struct { /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```
@bwendling is there any plan / possibility for simple expressions (with no side effects)? Like: struct libusb_bos_dev_capability_descriptor {
uint8_t bLength;
uint8_t bDescriptorType;
uint8_t bDevCapabilityType;
uint8_t dev_capability_data[] __attribute__((counted_by(bLength - 3)));
}; |
Not right now. Apple is in the process of expanding the bounds checking code beyond checking only flexible array members. We will most likely follow their lead on |
@bwendling thanks for your reply. No idea how representative it is of other projects, but 3 of the 6 flexible arrays in libusb structures have these kinds of non-trivial counts. The GCC folks seem to have considered adding such support, but it's not clear what, if anything, they decided. |
Can you point them out?
I know about that. They haven't made a decision on it as far as I know. |
Certainly. Here: libusb/libusb@28394f4 |
@seanm Apple's -fbounds-safety model will allow arithmetic expressions to be used for 'counted_by' attribute: https://github.com/llvm/llvm-project/blob/main/clang/docs/BoundsSafety.rst#external-bounds-annotations We are currently working on extending the 'counted_by' attribute argument to be parsed as an expression: #78000. While this patch will initially restrict the count argument to be "identification" only as it currently works (there's more work to do to actually allow an arbitrary expression), this will definitely help extend the model to support your use cases in the future. |
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' */ }; struct { /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```
This reverts commit fefdef8. Breaks check-clang, see llvm#76348 (comment) Also revert follow-on "[Clang] Update 'counted_by' documentation" This reverts commit 4a3fb9c.
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' */ }; struct { /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```
This reverts commit 164f85d.
The 'counted_by' attribute is used on flexible array members. The argument for the attribute is the name of the field member holding the count of elements in the flexible array. This information is used to improve the results of the array bound sanitizer and the '__builtin_dynamic_object_size' builtin. The 'count' field member must be within the same non-anonymous, enclosing struct as the flexible array member. For example: ``` struct bar; struct foo { int count; struct inner { struct { int count; /* The 'count' referenced by 'counted_by' */ }; struct { /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; } baz; }; ``` This example specifies that the flexible array member 'array' has the number of elements allocated for it in 'count': ``` struct bar; struct foo { size_t count; /* ... */ struct bar *array[] __attribute__((counted_by(count))); }; ``` This establishes a relationship between 'array' and 'count'; specifically that 'p->array' must have *at least* 'p->count' number of elements available. It's the user's responsibility to ensure that this relationship is maintained throughout changes to the structure. In the following, the allocated array erroneously has fewer elements than what's specified by 'p->count'. This would result in an out-of-bounds access not not being detected: ``` struct foo *p; void foo_alloc(size_t count) { p = malloc(MAX(sizeof(struct foo), offsetof(struct foo, array[0]) + count * sizeof(struct bar *))); p->count = count + 42; } ``` The next example updates 'p->count', breaking the relationship requirement that 'p->array' must have at least 'p->count' number of elements available: ``` void use_foo(int index, int val) { p->count += 42; p->array[index] = val; /* The sanitizer can't properly check this access */ } ``` In this example, an update to 'p->count' maintains the relationship requirement: ``` void use_foo(int index, int val) { if (p->count == 0) return; --p->count; p->array[index] = val; } ```
The 'counted_by' attribute is used on flexible array members. The
argument for the attribute is the name of the field member holding the
count of elements in the flexible array. This information is used to
improve the results of the array bound sanitizer and the
'__builtin_dynamic_object_size' builtin. The 'count' field member must
be within the same non-anonymous, enclosing struct as the flexible array
member. For example:
This example specifies that the flexible array member 'array' has the
number of elements allocated for it in 'count':
This establishes a relationship between 'array' and 'count';
specifically that 'p->array' must have at least 'p->count' number of
elements available. It's the user's responsibility to ensure that this
relationship is maintained throughout changes to the structure.
In the following, the allocated array erroneously has fewer elements
than what's specified by 'p->count'. This would result in an
out-of-bounds access not not being detected:
The next example updates 'p->count', breaking the relationship
requirement that 'p->array' must have at least 'p->count' number of
elements available:
In this example, an update to 'p->count' maintains the relationship
requirement: