Skip to content

API proposal: expose System.Text.Encoding.Latin1 getter publicly #31549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
GrabYourPitchforks opened this issue Nov 22, 2019 · 8 comments · Fixed by #37550
Closed

API proposal: expose System.Text.Encoding.Latin1 getter publicly #31549

GrabYourPitchforks opened this issue Nov 22, 2019 · 8 comments · Fixed by #37550
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Encoding
Milestone

Comments

@GrabYourPitchforks
Copy link
Member

GrabYourPitchforks commented Nov 22, 2019

The ISO-8859-1 encoding is used somewhat commonly in web scenarios since it's the encoding normally specified for headers. We should consider having a fast-access property for this encoding just like we have for Encoding.UTF8.

The workaround for now is to use Encoding.GetEncoding("iso-8859-1", ...), which works but is not convenient or optimized.

API Proposal

namespace System.Text
{
    public class Encoding
    {
        /* NEW static propery getter */
        public static Encoding Latin1 { get; }
    }
}

The return value of this would be equivalent to calling Encoding.GetEncoding("iso-8859-1"). The byte to char conversion is a naive widening operation. The char to byte conversion is a naive narrowing operation, with any char values above 0x007F being best-fit mapped to an ASCII byte. If a best-fit mapping is unavailable then the character is converted to (byte)'?'.

If the developer wishes to change the fallback behavior or to enable best fit mapping, they can call Encoding.GetEncoding("iso-8859-1", ...) and pass custom fallback behaviors as needed for the newly-created instance.

This proposal does not intend on making the existing internal Latin1Encoding class public.

/cc @davidfowl

@tarekgh
Copy link
Member

tarekgh commented Nov 22, 2019

We already have it and called Encoding.Latin1

private static Encoding Latin1 => Latin1Encoding.s_default;

@stephentoub
Copy link
Member

We already have it and called Encoding.Latin1

I think @GrabYourPitchforks is proposing it be made public.

@tarekgh
Copy link
Member

tarekgh commented Nov 24, 2019

Ah, somehow I had the impression that Encoding.Latin1 is already public :-) so it make sense to expose it (maybe with different name).

@GrabYourPitchforks
Copy link
Member Author

I'd be fine with keeping the property name Latin1. The actual concrete type doesn't need to be made public IMO.

@tarekgh
Copy link
Member

tarekgh commented Nov 25, 2019

I agree we don't have to expose Latin1Encoding class. For the naming, we need to choose the name which people more familiar with:

MIME / IANA ISO-8859-1
Alias(es) iso-ir-100, csISOLatin1, latin1, l1, IBM819, CP819

if we can have the name indicate ISO-8859-1 that would be better I think.

@scalablecory
Copy link
Contributor

I think this API would be fine.

Outside of the web (iirc HTML5 actually considers the two identical), Windows-1252 is in my experience even more common than ISO-8859-1.

@GrabYourPitchforks GrabYourPitchforks changed the title Add ISO-8859-1 accessor to the Encoding class API proposal: expose System.Text.Encoding.Latin1 getter publicly Jan 20, 2020
@msftgits msftgits transferred this issue from dotnet/corefx Feb 1, 2020
@msftgits msftgits added this to the Future milestone Feb 1, 2020
@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@GrabYourPitchforks
Copy link
Member Author

I updated the proposal to perform best-fit mapping during the narrowing conversion. This provides compatibility with the existing Latin1Encoding default behaviors.

@maryamariyan maryamariyan removed the untriaged New issue has not been triaged by the area owner label Mar 3, 2020
@stephentoub stephentoub modified the milestones: Future, 5.0 May 26, 2020
@stephentoub stephentoub added the blocking Marks issues that we want to fast track in order to unblock other important work label May 26, 2020
@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review blocking Marks issues that we want to fast track in order to unblock other important work labels May 26, 2020
@terrajobst
Copy link
Contributor

  • Looks good.
  • We won't be exposing the Latin1Encoding type, because it's not necessary
namespace System.Text
{
    public partial class Encoding
    {
        public static Encoding Latin1 => Encoding.GetEncoding("iso-8859-1");
    }
}

@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Encoding
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants