Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Vendoring Issues with Globs and Symlinks #984

Open
wants to merge 93 commits into
base: main
Choose a base branch
from
Open

Conversation

Listener430
Copy link
Collaborator

@Listener430 Listener430 commented Jan 29, 2025

what

Done:

  1. append //. at the end to github repo url (in order to clone entire repo)
  2. Removed any symlinks inside go-getter
  3. Fixed support for double star globs in excluded_paths and included_paths:
    included_paths:
      - "**/{demo-library,demo-stacks}/**/*.{tf,md}"
    excluded_paths:
      - "**/demo-library/**/*.{tfvars,tf}"
    
  4. Added test to vendoring scenario in fixtures
  5. Added depth=1 for all github downloads through custom detector
  6. Breaking change: * now correctly matches a single segment. Anyone using a single star to match multiple segments should change it to **. This should never have matched multiple segments so long as double star was supposed to work.

why

  • double star globs were not correctly matching multiple segments in all cases
  • vendoring without a shallow depth is 2x slower
  • the //. is an esoteric expression to copy all files from the root. Rather than expect users to know this, we default it where it makes sense.

references

Summary by CodeRabbit

  • New Features

    • Enhanced support for Git repository URLs across multiple hosts, with shallow clone and token improvements.
    • Introduced robust file and directory copying mechanisms with advanced pattern matching for vendor file selection.
    • Added a new function to mask user credentials in URLs for improved security.
    • Introduced new configuration files for vendor management and test cases for glob functionality.
  • Bug Fixes

    • Improved error handling and logging to enhance vendor and Git operations.
  • Documentation

    • Added detailed guidance on using glob patterns for flexible vendoring.
  • Tests

    • Expanded tests to verify both inclusion and exclusion of files during vendor operations, including new test cases for glob functionality.

@Listener430 Listener430 added the bugfix Change that restores intended behavior label Jan 29, 2025
@Listener430 Listener430 self-assigned this Jan 29, 2025
@Listener430 Listener430 requested a review from a team as a code owner January 29, 2025 18:30
@Listener430 Listener430 requested a review from osterman January 29, 2025 18:31
Copy link
Contributor

coderabbitai bot commented Jan 29, 2025

📝 Walkthrough

Walkthrough

This pull request introduces several improvements across multiple packages. In the vendor and git getter modules, function signatures now use pointer references for configuration, error handling has been standardized, and URL processing has been refined. A new type for Git operations enhances cloning with symlink removal. Additionally, a comprehensive file copy mechanism is added with advanced glob pattern matching and corresponding tests. New documentation and configuration files support these changes while masking of basic auth credentials is implemented in the utility package. A new dependency and error handling definitions further structure the codebase.

Changes

File(s) Change Summary
internal/exec/go_getter_utils.go Renamed CustomGitHubDetector to CustomGitDetector, added a source field, refactored the Detect method (with logging, scheme enforcement, path normalization, shallow clone, token injection, and subdir adjustment), and introduced CustomGitGetter with a new Get method using symlink removal. Updated function signatures to use *schema.AtmosConfiguration.
internal/exec/vendor_model.go, internal/exec/vendor_component_utils.go, internal/exec/vendor_model_component.go, internal/exec/validate_stacks.go, internal/exec/yaml_func_include.go, internal/exec/vendor_utils.go Updated various vendor functions to pass atmosConfig as a pointer, replaced copyToTarget with copyToTargetWithPatterns, and standardized error handling with improved logging and error wrapping.
internal/exec/copy_glob.go
internal/exec/copy_glob_test.go
Added new file implementing comprehensive file/directory copying with advanced glob pattern matching and error handling, along with a complete suite of tests validating file operations and pattern rules.
tests/cli_test.go Introduced a new FileNotExists field in the Expectation struct and added a helper function verifyFileNotExists to validate the absence of specified files.
tests/test-cases/demo-globs.yaml Added a new test case atmos_vendor_pull_with_globs verifying file inclusion and exclusion rules during vendor pulls.
website/docs/core-concepts/vendor/vendor-manifest.mdx New documentation section “Vendoring with Globs” detailing glob pattern usage, recursive matching, and examples for configuring file inclusion/exclusion.
pkg/utils/url_utils.go Introduced MaskBasicAuth, a function that masks credentials in URLs by replacing the username and password with “xxx”.
tests/fixtures/scenarios/vendor.yaml
tests/fixtures/scenarios/vendor-globs/atmos.yaml
tests/fixtures/scenarios/vendor-globs/vendor.yaml
Added and modified YAML configuration files for vendor scenarios, including minor formatting adjustments and new vendor configurations supporting glob patterns.
go.mod Added new dependency: github.com/agiledragon/gomonkey/v2 v2.13.0.
internal/exec/error.go New file defining error constants and variables (e.g., ErrDownloadPackage, ErrCopyPackage) along with operational constants for timeout, permissions, and formatting to standardize error reporting in the exec package.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant CustomGitGetter
    participant GitProcess
    participant FileSystem

    Client->>CustomGitGetter: Get(dst, url)
    CustomGitGetter->>GitProcess: Execute git clone
    GitProcess-->>CustomGitGetter: Clone complete
    CustomGitGetter->>FileSystem: removeSymlinks(dst)
    FileSystem-->>CustomGitGetter: Symlinks removed
    CustomGitGetter-->>Client: Return success/error
Loading
sequenceDiagram
    participant User
    participant CopyToTarget
    participant CopyDirRecursive
    participant ProcessDirEntry

    User->>CopyToTarget: copyToTargetWithPatterns(sourceDir, targetPath, patterns)
    CopyToTarget->>CopyDirRecursive: Begin recursive directory scan
    CopyDirRecursive->>ProcessDirEntry: Process each directory entry
    ProcessDirEntry-->>CopyDirRecursive: Return include/exclude decision
    CopyDirRecursive-->>CopyToTarget: Directory copying completed
    CopyToTarget-->>User: Return copy operation result
Loading

Possibly related PRs

Warning

There were issues while running some tools. Please review the errors and either fix the tool’s configuration or disable the tool if it’s a critical failure.

🔧 golangci-lint (1.62.2)

Error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24.0)
Failed executing command with error: can't load config: the Go language version (go1.23) used to build golangci-lint is lower than the targeted Go version (1.24.0)

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true under in the settings.
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary or @auto-summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @auto-title anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/exec/go_getter_utils.go (2)

160-168: Eliminate potential duplication with existing logic.

This RegisterCustomGetters seems to mirror the RegisterCustomDetectors pattern. Consider centralizing shared logic or converting both into a unified registration function to reduce redundancy.


207-220: Consider configurable symlink removal.

Automatically stripping symlinks is helpful, but give the user control to skip or confirm. This might be vital for some use cases or compliance requirements.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 255a05a and f9b3348.

📒 Files selected for processing (1)
  • internal/exec/go_getter_utils.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Analyze (go)
  • GitHub Check: Summary
🔇 Additional comments (3)
internal/exec/go_getter_utils.go (3)

102-109: Check for potential unintended path expansion.

Appending //. whenever the path ends with .git or has only three parts might impact cases where the user wants to download a subdirectory. Confirm that this logic is aligned with user expectations.


188-199: Good approach for overriding the default getters.

Overriding 'git' with a custom implementation is a practical solution. Be sure to keep an eye on future expansions (like s3, gcs) if needed.


222-236: Validate error handling on symlink removal.

If removing a symlink fails due to permission errors or other reasons, you might want more detailed logging or fallback behavior. Evaluate whether to continue, warn, or halt.

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 29, 2025
@osterman
Copy link
Member

Please add a test for this type of vendoring. It can be in the vendoring scenario we already have.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/exec/go_getter_utils.go (1)

202-231: Enhance symlink handling with additional safeguards.

While the implementation is functional, consider these improvements:

  1. Pass AtmosConfig to removeSymlinks instead of using an empty config
  2. Add protection against circular symlinks
  3. Log removed symlink details for audit purposes

Here's a suggested enhancement:

 type CustomGitGetter struct {
     getter.GitGetter
+    AtmosConfig schema.AtmosConfiguration
 }

 func (c *CustomGitGetter) Get(dst string, url *url.URL) error {
     if err := c.GitGetter.Get(dst, url); err != nil {
         return err
     }
-    return removeSymlinks(dst)
+    return removeSymlinks(dst, c.AtmosConfig)
 }

-func removeSymlinks(root string) error {
+func removeSymlinks(root string, config schema.AtmosConfiguration) error {
+    visited := make(map[string]bool)
     return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
         if err != nil {
             return err
         }
         if info.Mode()&os.ModeSymlink != 0 {
+            // Check for circular symlinks
+            realPath, err := filepath.EvalSymlinks(path)
+            if err != nil {
+                return fmt.Errorf("failed to evaluate symlink %s: %w", path, err)
+            }
+            if visited[realPath] {
+                return fmt.Errorf("circular symlink detected: %s", path)
+            }
+            visited[realPath] = true
+
+            // Log symlink details before removal
+            target, _ := os.Readlink(path)
-            u.LogWarning(schema.AtmosConfiguration{}, fmt.Sprintf("Removing symlink: %s", path))
+            u.LogWarning(config, fmt.Sprintf("Removing symlink: %s -> %s", path, target))
             return os.Remove(path)
         }
         return nil
     })
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9b3348 and ed1af3d.

📒 Files selected for processing (1)
  • internal/exec/go_getter_utils.go (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (4)
internal/exec/go_getter_utils.go (4)

73-73: LGTM! Field addition enhances URL handling capabilities.

The addition of the source field to store the original URL is a clean enhancement.


151-157: LGTM! Clean function signature update.

The modification properly propagates the source URL to the detector while maintaining the existing detector registration logic.


171-194: LGTM! Well-documented changes with secure defaults.

The changes are well-documented and the getters map provides a comprehensive set of protocols. Good security practice to have S3 and GCS getters disabled by default.


103-109: LGTM! Clean implementation of top-level repository detection.

The logic correctly handles bare repository URLs by appending "//." when appropriate. The debug logging provides good visibility into the decision-making process.

Let's verify the URL transformation behavior:

✅ Verification successful

URL handling logic is correctly implemented and follows go-getter conventions! 💪

The code properly handles repository URLs by:

  • Detecting bare repository URLs (without subdirectories)
  • Adding "//." suffix only when needed (for .git URLs or standard org/repo formats)
  • Maintaining compatibility with go-getter's URL format requirements
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Test various GitHub URL patterns to ensure correct handling
echo "Testing URL transformations..."

# Create a test file
cat << 'EOF' > test.go
package main

import (
    "fmt"
    "os"
    u "github.com/cloudposse/atmos/pkg/utils"
)

func main() {
    detector := &CustomGitHubDetector{source: os.Args[1]}
    result, ok, err := detector.Detect(os.Args[1], "")
    fmt.Printf("Input: %s\nOutput: %s\nOK: %v\nErr: %v\n", os.Args[1], result, ok, err)
}
EOF

# Test cases
urls=(
    "github.com/org/repo.git"
    "github.com/org/repo"
    "github.com/org/repo/subdir"
)

for url in "${urls[@]}"; do
    echo "Testing URL: $url"
    go run test.go "$url"
    echo "---"
done

Length of output: 6203


Script:

#!/bin/bash
# Find the URL handling implementation
echo "Searching for URL handling implementation..."
ast-grep --pattern 'type $_ struct {
  source string
  $$$
}'

echo -e "\nSearching for Detect method implementation..."
ast-grep --pattern 'func ($_ $_) Detect($_, $_) ($_, $_, error) {
  $$$
}'

echo -e "\nSearching for URL parsing logic..."
rg -A 5 'strings\.Contains.*"//"' --type go

Length of output: 998

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 30, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/exec/copy_glob.go (1)

168-255: Consider simplifying copyToTargetWithPatterns.

The function logic is clear, but the length and multiple branches make it complex. Splitting out smaller helper functions (e.g., for handling the included vs excluded paths) might improve maintainability and ease of unit testing.

🧰 Tools
🪛 GitHub Check: Build (macos-latest, macos)

[failure] 184-184:
too many arguments in call to u.LogTrace

🪛 GitHub Check: Build (ubuntu-latest, linux)

[failure] 184-184:
too many arguments in call to u.LogTrace

tests/fixtures/scenarios/vendor/vendor.yaml (1)

54-64: Add a test case to validate the newly introduced globs.

The expanded vendor configuration is helpful for controlling which files are included or excluded. Remember to add or update an integration test verifying that only .tf and .md files are included while .tfvars files are excluded for this source.

Would you like help creating a dedicated test that specifically checks the behavior of these globs?

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed1af3d and 5a6789e.

📒 Files selected for processing (4)
  • internal/exec/copy_glob.go (1 hunks)
  • internal/exec/vendor_model.go (1 hunks)
  • tests/fixtures/scenarios/vendor/vendor.yaml (1 hunks)
  • tests/test-cases/demo-stacks.yaml (1 hunks)
🧰 Additional context used
🪛 GitHub Check: Build (macos-latest, macos)
internal/exec/copy_glob.go

[failure] 55-55:
too many arguments in call to u.LogTrace


[failure] 64-64:
too many arguments in call to u.LogTrace


[failure] 67-67:
too many arguments in call to u.LogTrace


[failure] 78-78:
too many arguments in call to u.LogTrace


[failure] 81-81:
too many arguments in call to u.LogTrace


[failure] 87-87:
too many arguments in call to u.LogTrace


[failure] 119-119:
too many arguments in call to u.LogTrace


[failure] 157-157:
too many arguments in call to u.LogTrace


[failure] 162-162:
too many arguments in call to u.LogTrace


[failure] 184-184:
too many arguments in call to u.LogTrace

🪛 GitHub Check: Build (ubuntu-latest, linux)
internal/exec/copy_glob.go

[failure] 55-55:
too many arguments in call to u.LogTrace


[failure] 64-64:
too many arguments in call to u.LogTrace


[failure] 67-67:
too many arguments in call to u.LogTrace


[failure] 78-78:
too many arguments in call to u.LogTrace


[failure] 81-81:
too many arguments in call to u.LogTrace


[failure] 87-87:
too many arguments in call to u.LogTrace


[failure] 119-119:
too many arguments in call to u.LogTrace


[failure] 157-157:
too many arguments in call to u.LogTrace


[failure] 162-162:
too many arguments in call to u.LogTrace


[failure] 184-184:
too many arguments in call to u.LogTrace

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (2)
internal/exec/vendor_model.go (1)

310-310: Switching to copyToTargetWithPatterns looks good.

This change aligns with the new glob-handling logic. The final code is consistent with the PR’s objective to support inclusion and exclusion patterns.

tests/test-cases/demo-stacks.yaml (1)

196-204: Enhanced Vendoring File Checks Added

The new file existence entries in the "atmos_vendor_pull" test bolster the validation of the vendoring process. They ensure that the expected files from both the demo-library and demo-stacks components are present after cloning and post-symlink removal. This aligns well with the PR objectives addressing glob functionality and symlink handling.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/exec/go_getter_utils.go (2)

202-215: Consider enhancing error context.

While the implementation is solid, consider wrapping the errors with additional context to aid in debugging:

 func (c *CustomGitGetter) Get(dst string, url *url.URL) error {
 	// Normal clone
 	if err := c.GitGetter.Get(dst, url); err != nil {
-		return err
+		return fmt.Errorf("git clone failed for %s: %w", url, err)
 	}
 	// Remove symlinks
-	return removeSymlinks(dst)
+	if err := removeSymlinks(dst); err != nil {
+		return fmt.Errorf("failed to remove symlinks in %s: %w", dst, err)
+	}
+	return nil
 }

219-231: Consider adding path validation and detailed logging.

The symlink removal implementation could be enhanced with additional safety checks and logging:

 func removeSymlinks(root string) error {
+	// Validate root path
+	if _, err := os.Stat(root); err != nil {
+		return fmt.Errorf("invalid root path %s: %w", root, err)
+	}
+
 	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
 		if err != nil {
 			return err
 		}
 		if info.Mode()&os.ModeSymlink != 0 {
-			u.LogWarning(schema.AtmosConfiguration{}, fmt.Sprintf("Removing symlink: %s", path))
+			target, err := os.Readlink(path)
+			if err != nil {
+				return fmt.Errorf("failed to read symlink %s: %w", path, err)
+			}
+			u.LogWarning(schema.AtmosConfiguration{}, fmt.Sprintf("Removing symlink: %s -> %s", path, target))
 			// It's a symlink, remove it
-			return os.Remove(path)
+			if err := os.Remove(path); err != nil {
+				return fmt.Errorf("failed to remove symlink %s: %w", path, err)
+			}
 		}
 		return nil
 	})
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fdffa3 and b8d7a5a.

📒 Files selected for processing (2)
  • internal/exec/go_getter_utils.go (4 hunks)
  • internal/exec/vendor_model.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/exec/vendor_model.go
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (2)
internal/exec/go_getter_utils.go (2)

71-74: LGTM! Smart handling of top-level repository cloning.

The addition of the source field and the logic to append "//." for top-level repositories is a solid improvement. This ensures proper cloning behavior when no subdirectory is specified.

Also applies to: 103-109


151-155: LGTM! Well-documented parameter addition.

The addition of the source parameter and the detailed comments explaining its purpose make the code's intent clear and maintainable.

Also applies to: 171-175

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/exec/copy_glob.go (1)

47-92: ⚠️ Potential issue

Fix LogTrace calls to match the updated function signature.

Based on the pipeline failures and past review comments, the LogTrace calls need to be updated.

Apply this fix to all LogTrace calls in the function:

-    u.LogTrace(atmosConfig, fmt.Sprintf("Error computing relative path for %q: %v", srcPath, err))
+    u.LogTrace(fmt.Sprintf("Error computing relative path for %q: %v", srcPath, err))
🧹 Nitpick comments (4)
internal/exec/copy_glob.go (3)

16-45: Consider handling errors from deferred Close() calls.

While the implementation is solid, it's good practice to handle errors from Close() operations, especially for the destination file where data corruption could occur.

Here's how you could improve the error handling:

 func copyFile(atmosConfig schema.AtmosConfiguration, src, dst string) error {
     sourceFile, err := os.Open(src)
     if err != nil {
         return fmt.Errorf("opening source file %q: %w", src, err)
     }
-    defer sourceFile.Close()
+    defer func() {
+        if cerr := sourceFile.Close(); cerr != nil && err == nil {
+            err = fmt.Errorf("closing source file %q: %w", src, cerr)
+        }
+    }()
 
     if err := os.MkdirAll(filepath.Dir(dst), os.ModePerm); err != nil {
         return fmt.Errorf("creating destination directory for %q: %w", dst, err)
     }
 
     destinationFile, err := os.Create(dst)
     if err != nil {
         return fmt.Errorf("creating destination file %q: %w", dst, err)
     }
-    defer destinationFile.Close()
+    defer func() {
+        if cerr := destinationFile.Close(); cerr != nil && err == nil {
+            err = fmt.Errorf("closing destination file %q: %w", dst, cerr)
+        }
+    }()

139-166: Consider extracting pattern suffixes as constants.

The pattern matching logic is well-implemented, but the magic strings could be made more maintainable.

Consider this improvement:

+const (
+    singleLevelGlob = "/*"
+    recursiveGlob   = "/**"
+)
+
 func getMatchesForPattern(atmosConfig schema.AtmosConfiguration, sourceDir, pattern string) ([]string, error) {
     fullPattern := filepath.Join(sourceDir, pattern)
     matches, err := u.GetGlobMatches(fullPattern)
     if err != nil {
         return nil, fmt.Errorf("error getting glob matches for %q: %w", fullPattern, err)
     }
     if len(matches) == 0 {
-        if strings.HasSuffix(pattern, "/*") {
-            recursivePattern := strings.TrimSuffix(pattern, "/*") + "/**"
+        if strings.HasSuffix(pattern, singleLevelGlob) {
+            recursivePattern := strings.TrimSuffix(pattern, singleLevelGlob) + recursiveGlob

168-255: Consider pre-allocating the map with an estimated size.

The implementation is solid, but the map allocation could be optimized when processing inclusion patterns.

Consider this optimization:

     if len(s.IncludedPaths) > 0 {
-        filesToCopy := make(map[string]struct{})
+        // Pre-allocate map with estimated size based on number of patterns
+        filesToCopy := make(map[string]struct{}, len(s.IncludedPaths)*10)

This pre-allocation helps avoid map resizing during population, potentially improving performance when dealing with many files.

internal/exec/go_getter_utils.go (1)

202-233: Consider adding progress logging for symlink removal.

The symlink removal implementation is solid, but for better observability, consider adding debug logging for the number of symlinks removed.

Here's a suggested enhancement:

 func removeSymlinks(root string) error {
+    removedCount := 0
     err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
         if err != nil {
             return err
         }
         if info.Mode()&os.ModeSymlink != 0 {
-            return os.Remove(path)
+            if err := os.Remove(path); err != nil {
+                return err
+            }
+            removedCount++
             return nil
         }
         return nil
     })
+    if removedCount > 0 {
+        u.LogDebug(fmt.Sprintf("Removed %d symlinks from %s", removedCount, root))
+    }
     return err
 }
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8d7a5a and 06f5f3b.

📒 Files selected for processing (2)
  • internal/exec/copy_glob.go (1 hunks)
  • internal/exec/go_getter_utils.go (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build (windows-latest, windows)
  • GitHub Check: Summary
🔇 Additional comments (5)
internal/exec/copy_glob.go (2)

3-13: LGTM! Well-organized imports with clear purpose.

The imports are appropriately chosen for file operations and include an optimized copy library with a clear explanatory comment.


94-137: LGTM! Excellent symlink handling implementation.

The function properly implements the PR objective of removing symlinks by explicitly skipping them during the copy process. The error handling is comprehensive and the recursive logic is sound.

internal/exec/go_getter_utils.go (3)

71-74: LGTM! Clean struct extension.

The addition of the source field to CustomGitHubDetector is a well-structured change that maintains backward compatibility.


171-194: LGTM! Well-documented client configuration.

The client configuration is clear and the comments explaining the source parameter's importance are helpful.


219-233: Consider adding test coverage for symlink removal.

The symlink removal functionality would benefit from unit tests to verify its behavior.

Would you like me to generate unit tests for the removeSymlinks function to ensure it handles various scenarios correctly?

Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <erik@cloudposse.com>
Copy link

mergify bot commented Mar 12, 2025

💥 This pull request now has conflicts. Could you fix it @Listener430? 🙏

@mergify mergify bot added the conflict This PR has conflicts label Mar 12, 2025
Listener430 and others added 5 commits March 12, 2025 18:36
Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <erik@cloudposse.com>
Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <erik@cloudposse.com>
Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <erik@cloudposse.com>
Co-authored-by: Erik Osterman (CEO @ Cloud Posse) <erik@cloudposse.com>
@mergify mergify bot removed the conflict This PR has conflicts label Mar 12, 2025
@Listener430 Listener430 requested a review from osterman March 12, 2025 20:02
@osterman osterman added patch A minor, backward compatible change and removed no-release Do not create a new release (wait for additional code changes) labels Mar 13, 2025
Copy link

mergify bot commented Mar 13, 2025

💥 This pull request now has conflicts. Could you fix it @Listener430? 🙏

@mergify mergify bot added the conflict This PR has conflicts label Mar 13, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
website/docs/core-concepts/vendor/vendor-manifest.mdx (3)

556-564: Non-Recursive Pattern Example Insight
The "Non-Recursive Pattern That Doesn't Work" example clearly highlights the limitation of using a single asterisk (*) versus a recursive double asterisk (**). For enhanced clarity, consider adding a brief note explaining why * only matches immediate children and how that might lead to unintended inclusions.


570-573: Curly Brace Expansion Explanation
The explanation on using curly braces ({...}) to expand multiple patterns is useful. As a minor nitpick, consider using the typographical ellipsis (  ) for improved readability. For example, you might update the sentence as follows:

-Curly braces `{...}` allow for expanding multiple patterns into separate glob matches.
+Curly braces `{…}` allow for expanding multiple patterns into separate glob matches.
🧰 Tools
🪛 LanguageTool

[style] ~570-~570: Consider using the typographical ellipsis character here instead.
Context: ...ers, use **/demo-library/**/*. Using {...} for Multiple Extensions or Patterns ...

(ELLIPSIS)


[style] ~572-~572: Consider using the typographical ellipsis character here instead.
Context: ...le Extensions or Patterns Curly braces {...} allow for expanding multiple patterns...

(ELLIPSIS)


604-611: Key Takeaways Summary Style
The "Key Takeaways" section is a strong recap of the important points. Consider rephrasing the successive sentences that begin similarly and applying typographical ellipsis where appropriate (per the static analysis hints) to enhance style and readability.

🧰 Tools
🪛 LanguageTool

[style] ~608-~608: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...won't include deeper subdirectories. 3. Use {...} to match multiple extensions or...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~608-~608: Consider using the typographical ellipsis character here instead.
Context: ...t include deeper subdirectories. 3. Use {...} to match multiple extensions or direc...

(ELLIPSIS)


[style] ~611-~611: Consider using the typographical ellipsis character here instead.
Context: ...included_paths, excluded_paths, and {...} expansion, you can precisely control ...

(ELLIPSIS)

internal/exec/go_getter_utils.go (1)

262-263: Use updated logging pattern

There's a deprecated logging function being used here.

-log.Debug("Detected top-level repo with no subdir: appending '//.'", keyURL, maskedSrc)
+log.Debug("Detected top-level repo with no subdir: appending '//.'\n", "url", maskedSrc)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ec9522 and 08c9f37.

📒 Files selected for processing (2)
  • internal/exec/go_getter_utils.go (4 hunks)
  • website/docs/core-concepts/vendor/vendor-manifest.mdx (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
internal/exec/go_getter_utils.go (2)
Learnt from: Listener430
PR: cloudposse/atmos#1061
File: internal/exec/go_getter_utils.go:74-75
Timestamp: 2025-03-12T18:52:31.814Z
Learning: In the `CustomGitDetector.Detect` method of `internal/exec/go_getter_utils.go`, verbose debug logging of raw URLs is intentionally kept for debugging purposes, despite potential credential exposure risks.
Learnt from: osterman
PR: cloudposse/atmos#984
File: internal/exec/go_getter_utils.go:103-109
Timestamp: 2025-03-12T18:52:31.814Z
Learning: When checking for subdirectories in GitHub URLs, use `parsedURL.Path` to check for "//" instead of the entire URL, as the scheme portion (e.g., "https://") will always contain "//".
🪛 LanguageTool
website/docs/core-concepts/vendor/vendor-manifest.mdx

[style] ~570-~570: Consider using the typographical ellipsis character here instead.
Context: ...ers, use **/demo-library/**/*. Using {...} for Multiple Extensions or Patterns ...

(ELLIPSIS)


[style] ~572-~572: Consider using the typographical ellipsis character here instead.
Context: ...le Extensions or Patterns Curly braces {...} allow for expanding multiple patterns...

(ELLIPSIS)


[style] ~602-~602: Consider using the typographical ellipsis character here instead.
Context: ... **/demo-library/**/archive/** Using {...} here prevents the need to write two s...

(ELLIPSIS)


[style] ~608-~608: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...won't include deeper subdirectories. 3. Use {...} to match multiple extensions or...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~608-~608: Consider using the typographical ellipsis character here instead.
Context: ...t include deeper subdirectories. 3. Use {...} to match multiple extensions or direc...

(ELLIPSIS)


[style] ~611-~611: Consider using the typographical ellipsis character here instead.
Context: ...included_paths, excluded_paths, and {...} expansion, you can precisely control ...

(ELLIPSIS)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Summary
🔇 Additional comments (24)
website/docs/core-concepts/vendor/vendor-manifest.mdx (5)

501-504: Section Introduction Clarity
The new "Vendoring with Globs" section is clearly introduced and sets the stage well for the detailed explanations that follow. It effectively outlines the purpose of glob patterns in controlling file inclusion/exclusion during vendoring.


509-533: Comprehensive Glob Syntax Definitions
The detailed list of glob pattern elements (e.g., *, **, ?, [abc], [a-z], {a,b,c}) is thorough and provides clear examples. This level of detail helps users understand how each element operates.


537-547: Clear Example: Excluding a Subdirectory
The example demonstrating how to exclude subdirectories (using included_paths and excluded_paths) is well articulated and easy to follow. It effectively shows the intended behavior of excluding, for instance, any stargazers folder within demo-library.


574-588: Matching Multiple File Extensions Example
The example showing how to match multiple file extensions using brace expansion is concise and informative. It clearly demonstrates that the shorthand is equivalent to writing out two separate rules, which enhances understanding.


591-601: Excluding Multiple Directories Example
The provided example using brace expansion to exclude directories (e.g., both stargazers and archive) is clear and practical. This example effectively demonstrates how to simplify exclusion rules. A quick review for consistency with earlier examples might be beneficial, but overall it’s well presented.

internal/exec/go_getter_utils.go (19)

66-70: Good extension of detector to support multiple Git providers

The renamed CustomGitDetector now properly handles multiple Git hosting providers instead of just GitHub. Adding the source field helps track the original URL throughout the process.


74-75: Debug logging looks good but consider implications

Adding debug logging here is helpful for troubleshooting. Remember that this method will log raw URLs which could contain credentials, but the retrieved learning indicates this is intentional for debugging purposes.


80-82: Good modularization with ensureScheme

Breaking out URL schema detection into a separate method improves code organization and readability.


86-89: Nice credential masking implementation

The error handling now properly masks any basic auth credentials before logging or returning errors. This is an important security practice.


95-98: Properly expanded host support

The detector now supports multiple Git hosts (GitHub, GitLab, Bitbucket) instead of just GitHub. Good extension of functionality.


107-112: Well-implemented shallow clone support

Setting depth=1 as default when not specified improves performance for large repositories. The implementation correctly preserves any user-specified depth value.


114-120: Proper URL credential masking in logs

Good job masking credentials before logging the final URL. The error handling for the masking operation is also well implemented.


137-152: Strong SCP-style URL handling

The ensureScheme method properly handles SCP-style URLs with good logging and credential masking.


154-174: Well-structured SCP-URL rewriting

The regex pattern handles SCP-style URLs correctly, and the method transparently converts them to proper SSH URLs.


186-197: Good token injection pattern

The token injection logic is well-structured and includes proper logging with masked credentials.


199-231: Comprehensive token resolution strategy

Good implementation of token resolution that tries multiple environment variables based on the hosting provider. The priority order makes sense.


233-253: Well-implemented provider-specific username handling

Different Git providers require different username conventions for token auth. This is handled correctly for each supported provider.


255-266: Fixed subdirectory handling

The adjustSubdir method correctly implements the PR objective by appending "//." to repository URLs that don't specify a subdirectory. The implementation now correctly checks parsedURL.Path for "//" as recommended in the retrieved learning.


270-270: Updated function signature correctly uses pointer

The signature now accepts a pointer to AtmosConfiguration, aligning with other changes in the codebase.


290-294: Clear documentation of custom detector behavior

The comments clearly explain why the original source is passed to the detector and how the subdirectory handling works.


302-311: Complete getters map implementation

The getters map now includes the custom git getter and all standard getters. The commented-out getters are helpfully preserved for future extension.


320-333: Clean implementation of CustomGitGetter

The CustomGitGetter properly extends the standard GitGetter and implements the necessary functionality to remove symlinks after a successful clone.


335-349: Well-implemented symlink removal

The removeSymlinks function correctly walks the directory tree and removes symlinks, addressing the PR objective. Good error handling and logging.


351-352: Updated function signature correctly uses pointer

The signature for DownloadDetectFormatAndParseFile now uses a pointer for atmosConfig, consistent with other changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix Change that restores intended behavior conflict This PR has conflicts patch A minor, backward compatible change size/xl
Projects
None yet
4 participants