Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add device orientation information to header information #5894

Open
wants to merge 10 commits into
base: dev
Choose a base branch
from

Conversation

sbfkcel
Copy link

@sbfkcel sbfkcel commented Mar 3, 2025

This is useful if need to draw the device appearance correctly.

Related Discussions

#5704
#3605
#925

@rom1v
Copy link
Collaborator

rom1v commented Mar 3, 2025

Hi,

Thanks for your work.

However, as I said in #5704 (comment), there is no really need to know the device orientation, since it can be inferred from the frame size in practice, and there are cases where the "orientation" to write is not clear.

You replied:

But in some cases, it would be good to have the correct device rotation direction. For example, it is necessary to correctly draw the appearance of a device

You can do this based on the actual frame size.

In theory, it could make sense to pass the device orientation (at least in some cases) in packets headers (even if it could be argued that it's overkill to pass it for each packet/frame).

But then, the orientation must exactly match the frame it describes, which is not guaranteed here. There is no mechanism to make sure that the frame produced by MediaCodec is in the current device orientation (as returned by Device.getCurrentRotation(this.displayId)), it may have changed, so a few frames may be tagged with an incorrect orientation. On the contrary, the frame size is guaranteed to be in sync.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 3, 2025

Thanks for reply

The size of the screen can only be used to determine whether it is displayed horizontally or vertically.

If it is just a pure screen display, then it is meaningless to obtain the device direction.

However, if you need to draw the hole position of the front camera in the screen normally, or if you need to draw the position of the device appearance button (power button) as discussed above, then you need an accurate device direction.

I used to monitor the change of the screen size, and when the size changes, I used ADB to obtain the screen direction, which will cause a very large delay.

image

From the current test, the implementation in the code is relatively ideal. The human eye cannot clearly distinguish that the obtained value is abnormal.

Or is it feasible for us to make this item an optional method?

Currently it is also sent as part of the metadata extension. It should not be considered redundant.

@rom1v
Copy link
Collaborator

rom1v commented Mar 5, 2025

However, if you need to draw the hole position of the front camera in the screen normally, or if you need to draw the position of the device appearance button (power button) as discussed above, then you need an accurate device direction.

OK.

What is the client on your screenshot? Is it a modified scrcpy client, or an alternative client you wrote?

The human eye cannot clearly distinguish that the obtained value is abnormal.

If the orientation is sent "inline", it must be in sync with the encoded frames. Btw, it could allow to print a better error message for #1645.

The current protocol has two "packet types":

  • a codec packet; for video: codec id (u32), width (u32), height (u32)
  • a media packet for each frame (with a 12 bytes header)

To support your use case, I suggest to add a new packet type:

  • a codec packet, with only the codec id (u32, i.e. 4 bytes)
  • a session packet for video (12 bytes): width (u32), height (u32), orientation (u3, there a 8 possible orientation with --capture-orientation), sent every time the encoding is restarted (e.g. on rotation)
  • a media packet for each frame (with a 12 bytes header)

The session packet could be distinguished from the media packet with a single bit, and for simplicity both headers would have 12 bytes.

It would be sent when a new encoding session starts, with the values in sync to what is passed to the encoder.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 6, 2025

Yes, I wrote an alternative client(browser-based).

I will try to make some adjustments based on your suggestions and then submit a new implementation.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 11, 2025

@rom1v

Based on your recommendations, I have implemented the modifications and completed local testing and verification.

A new video session packet has been added, consisting of a 12-byte header + 16-byte data payload:

  • The header format remains consistent with other data headers, distinguished by the flag long PACKET_FLAG_VIDEO_SESSION = 1L << 61.
  • The 16-byte data payload comprises four fields: Orientation, Width, Height, and Flip, with each field occupying 4 bytes.
    Key design details:

Orientation reflects the device’s original orientation, decoupled from the flip parameter in the --display-orientation argument. This separation allows users to independently handle orientation and flip states in their business logic for greater flexibility.

The -V debug flag enables debug output on the client side, displaying parsed metadata.

Build and execution command:

meson setup x --buildtype=release --strip -Db_lto=true && ninja -Cx && ./run x -V debug 

Example debug output:

DEBUG: Orientation=1, Width=1920, Height=864, Flip=False, DataSize=16  

@rom1v
Copy link
Collaborator

rom1v commented Mar 11, 2025

Thank you for your work.

After a quick glance, here are some remarks.

In my previous comment, the idea was to replace:

   CODEC packet      MEDIA packets
[codec|width|height] […] […] […] […] […]
                                ^
                                rotation occurs

by:

CODEC      SESSION packet                    SESSION packet
[codec] [width|height|orient] […] […] […] [width|height|orient] […] […]
                                         ^
                                         rotation occurs

The session packet would be 12 bytes (like media packets, this simplifies reader the header) without payload.

Although scrcpy currently does not need/use this data, in principle I would be ok to add a session packet like that (a first commit could just refactor to include width and height for each session, without the orientation, then the orientation would be added in a separate commit). But this has consequences: the information should be passed "downstream" to the decoder, display, recorder and v4l2 (it impacts the way they work, since the codec context needs the initial size, which would not be in the "codec packet" anymore).

The 16-byte data payload comprises four fields: Orientation, Width, Height, and Flip, with each field occupying 4 bytes.

The orientation/flip only requires 3 bits:

scrcpy/app/src/options.h

Lines 72 to 85 in 7998811

// ,----- hflip (applied before the rotation)
// | ,--- 180°
// | | ,- 90° clockwise
// | | |
enum sc_orientation { // v v v
SC_ORIENTATION_0, // 0 0 0
SC_ORIENTATION_90, // 0 0 1
SC_ORIENTATION_180, // 0 1 0
SC_ORIENTATION_270, // 0 1 1
SC_ORIENTATION_FLIP_0, // 1 0 0
SC_ORIENTATION_FLIP_90, // 1 0 1
SC_ORIENTATION_FLIP_180, // 1 1 0
SC_ORIENTATION_FLIP_270, // 1 1 1
};

This would allow to make the session packet payload-less.

In your implementation, you should not add display-specific data to SurfaceEncoder, which encodes any surface (not necessarily a display), for example the camera. Check ScreenCapture instead. And from there, you already have all the orientation data for the session, no need to add new methods in WindowManager and Device a priori.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 11, 2025

OK, I did overlook factors like the camera. I'll submit a new implementation plan later.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 12, 2025

Through testing and verification, it is found that the changes in each device direction cannot be captured in ScreenCapture. For example: 0->2, 1->3, when the screen size does not change, the logic will not be called, but the physical device direction has changed.

Rotating from 0->1->2->3 in sequence will trigger re-capture because the size has changed.

Therefore, perhaps the best way is to implement it in SurfaceEncoder. By adding a parameter to SurfaceEncoder to distinguish whether the device is a display, or is there another better way?

@rom1v
Copy link
Collaborator

rom1v commented Mar 12, 2025

it is found that the changes in each device direction cannot be captured in ScreenCapture. For example: 0->2, 1->3, when the screen size does not change, the logic will not be called, but the physical device direction has changed.

Because currently the encoding session is "reset" only if the size changes:

private void checkDisplaySizeChanged() {
DisplayInfo di = ServiceManager.getDisplayManager().getDisplayInfo(displayId);
if (di == null) {
Ln.w("DisplayInfo for " + displayId + " cannot be retrieved");
// We can't compare with the current size, so reset unconditionally
if (Ln.isEnabled(Ln.Level.VERBOSE)) {
Ln.v("DisplaySizeMonitor: requestReset(): " + getSessionDisplaySize() + " -> (unknown)");
}
setSessionDisplaySize(null);
listener.onDisplaySizeChanged();
} else {
Size size = di.getSize();
// The field is hidden on purpose, to read it with synchronization
@SuppressWarnings("checkstyle:HiddenField")
Size sessionDisplaySize = getSessionDisplaySize(); // synchronized
// .equals() also works if sessionDisplaySize == null
if (!size.equals(sessionDisplaySize)) {
// Reset only if the size is different
if (Ln.isEnabled(Ln.Level.VERBOSE)) {
Ln.v("DisplaySizeMonitor: requestReset(): " + sessionDisplaySize + " -> " + size);
}
// Set the new size immediately, so that a future onDisplayChanged() event called before the asynchronous prepare()
// considers that the current size is the requested size (to avoid a duplicate requestReset())
setSessionDisplaySize(size);
listener.onDisplaySizeChanged();
} else if (Ln.isEnabled(Ln.Level.VERBOSE)) {
Ln.v("DisplaySizeMonitor: Size not changed (" + size + "): do not requestReset()");
}
}
}

rom1v added a commit that referenced this pull request Mar 12, 2025
Reset encoding on rotation even if the resulting size is the same.

Refs <#5894 (comment)>
@rom1v
Copy link
Collaborator

rom1v commented Mar 12, 2025

With this change, it resets on rotation even if the size is the same: a98e500

@rom1v
Copy link
Collaborator

rom1v commented Mar 12, 2025

On a PR, please never include merge commits, always rebase on the latest dev branch.

@sbfkcel
Copy link
Author

sbfkcel commented Mar 12, 2025

OK, wrong operation

@rom1v
Copy link
Collaborator

rom1v commented Mar 12, 2025

With this change, it resets on rotation even if the size is the same: a98e500

On the other hand, it causes an unnecessary reset (since it works without it). This is just a corner case, but in theory we should NOT reset the encoding, but still send a new session packet to signal the new rotation.

But tht's probably not possible in a non racy way (we can't determine exactly from which frame the new rotation is taken into account).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants