Add a flag to emit ASCII-only output #1395

Closed
dwightjack opened this issue Aug 25, 2014 · 102 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@dwightjack

I'm working with Sass 3.4.

My sass source code is:

#test
  content:  "\f000"

when parsed it got converted to:

#test {
  content: "";
}

I'm using the expanded style and the sass syntax. I've noticed that switching to scss works correctly, but my entire codebase is in sass format :(

Is there any way to preserve the original Unicode escape in the content string?

Thanks in advance

@quantizor

+1, I'm facing this as well and it's causing a lot of issues because now the SCSS file gets flagged as UTF-8, even though I specifically wrote it as ASCII. Then SASS outputs a file with a BOM, which ends up in the middle of my concatenated files and causes other headaches.

@lolmaus

lolmaus commented Aug 26, 2014

now the SCSS file gets flagged as UTF-8, even though I specifically wrote it as ASCII.

That's in compliance with the CSS Syntax Module Level 3.

Then SASS outputs a file with a BOM, which ends up in the middle of my concatenated files and causes other headaches.

I've specifically tried it and Sass 3.4.1 does not produce a BOM for me (though it didn't convert \f000 into Unicode for some reason; I had to manually type the literal character into the source Sass).

Anyway, isn't it your concatenation tool's responsibility to produce valid files? BOM aside, the @charset directive should only appear at the very beginning of the resulting file. Sass takes care of that for you, but if you postprocess Sass-generated CSS files (which are CSS3-valid) and your postprocessor fails to produce a valid file, there isn't much Sass can do, is there?

@dwightjack
Author

@lolmaus Compliance matters, but sometimes projects don't comply :p

I've specifically tried it and Sass 3.4.1 does not produce a BOM for me (though it didn't convert \f000 into Unicode for some reason; I had to manually type the literal character into the source Sass).

I've experienced the conversion problem only with the .sass extension and Sass syntax. Switching to .scss works as expected.

@lolmaus

lolmaus commented Aug 26, 2014

@dwightjack, tried with .sass, still no BOM.

@dwightjack
Author

@lolmaus yep, the BOM issue doesn't come up by changing the extension; what changes is that with .scss \f000 is preserved, while with .sass it's converted into the literal character.

@quantizor

@lolmaus it's right in the documentation. If you output in compressed style and any utf-8 chars are detected, the file is outputted as UTF-8 with a BOM.

@quantizor

I'm trying to force ASCII in every way I can think of, but 3.4's new behavior to convert the escaped characters into their real entities is defeating that effort.

@quantizor

I'm using the scss CLI command to compile instead of sass, does that matter?

@nex3
Contributor

nex3 commented Aug 28, 2014

This is the expected behavior. Sass parses strings according to the CSS spec, and the spec says to interpret Unicode escapes as Unicode characters. The only reason the escapes might be preserved is the time-saving optimization wherein certain property values aren't parsed if we can guarantee they don't contain any dynamic code.

What is the actual practical problem with producing UTF-8 output? Is it exclusively other tools that don't handle @charset declarations/BOMs correctly?

@quantizor

Well I guess I will never upgrade to 3.4 then. This is a big issue and SASS shouldn't be fooling around with the input values, especially if we specifically put a @charset "ASCII" at the top.

@nex3
Contributor

nex3 commented Aug 28, 2014

SASS shouldn't be fooling around with the input values

This is literally Sass's entire job.

especially if we specifically put a @charset "ASCII" at the top.

The @charset declaration determines what the input character set is, not the output.

You still haven't answered my question: what practical problems does UTF-8 output cause?

@quantizor

No, SASS's job is to rework selector trees, provide utility functions, variables, etc. With the exception of changing color values to shortened versions for compressed output, it historically has not tampered with the values assigned to a property unless you specifically use a function of some kind.

The issue is not UTF-8, but the fact that 3.4 outputs these files with a byte order mark in compressed mode. This creates complications when concatenating files together, because now all of a sudden there is a BOM in the middle of the file and the browser CSS parser freaks out and drops the next rule as invalid/malformed.

Could I set up another job to strip out BOMs? I guess, but SASS shouldn't be including that character or doing this implicit conversion in the first place.

@nex3
Contributor

nex3 commented Aug 28, 2014

No, SASS's job is to rework selector trees, provide utility functions, variables, etc. With the exception of changing color values to shortened versions for compressed output, it historically has not tampered with the values assigned to a property unless you specifically use a function of some kind.

Sass has parsed and reformatted property values for a long time. The guarantee it offers is that plain CSS input will produce semantically identical CSS output; Unicode characters are semantically identical to their escape codes, so this guarantee is being upheld.

The issue is not UTF-8, but the fact that 3.4 outputs these files with a byte order mark in compressed mode. This creates complications when concatenating files together, because now all of a sudden there is a BOM in the middle of the file and the browser CSS parser freaks out and drops the next rule as invalid/malformed.

So it is an issue with the tooling. I suggest you use a concatenation tool (such as Sass itself) that knows how to parse CSS files according to the spec.
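
To make the concatenation problem concrete, here is a minimal Python sketch of a spec-aware join step. It assumes the stylesheets have already been decoded to text; `normalize_css` and `concat_css` are hypothetical helper names, not part of Sass or any existing tool. Each input loses its leading BOM and leading @charset rule, and a single @charset is emitted at the top only when the result contains non-ASCII characters.

```python
import re
from typing import Iterable

UTF8_BOM = "\ufeff"
# Matches a leading @charset rule such as: @charset "UTF-8";
CHARSET_RULE = re.compile(r'^@charset\s+"[^"]*";\s*', re.IGNORECASE)


def normalize_css(text: str) -> str:
    """Drop a leading BOM and a leading @charset rule from one stylesheet."""
    if text.startswith(UTF8_BOM):
        text = text[len(UTF8_BOM):]
    return CHARSET_RULE.sub("", text, count=1)


def concat_css(sheets: Iterable[str]) -> str:
    """Join already-decoded stylesheets into one stylesheet (hypothetical helper)."""
    body = "\n".join(normalize_css(sheet) for sheet in sheets)
    # Declare the encoding once, at the very start, and only if it is needed.
    if not body.isascii():
        body = '@charset "UTF-8";\n' + body
    return body


if __name__ == "__main__":
    compressed = "\ufeffp a{content:\"\uf00d\"}"          # BOM, as in compressed output
    expanded = '@charset "UTF-8";\np b { color: red; }'  # @charset, as in expanded output
    print(repr(concat_css([compressed, expanded])))
```

A naive byte-level `cat` keeps the second file's BOM or @charset in the middle of the result, which is the failure described above.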

@quantizor

If I intended to put a unicode character in my stylesheet, I would have pasted in the actual character. The escaped sequence is in there for a reason and before 3.4, SASS didn't screw with that.

This is very upsetting in general. I'm kind of in disbelief that you would introduce such a change without documenting it more visibly.

@nex3
Contributor

nex3 commented Aug 28, 2014

I'm done arguing with you about this. I think I've stated my point.

This issue is still open because I think it makes sense to add a flag to Sass to tell it to emit ASCII output.
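
As a rough illustration of what ASCII-only output could amount to, the following Python sketch rewrites every non-ASCII character in a CSS string as a hex escape. This is an assumption about the idea, not how Sass implements or would implement such a flag; `to_ascii_css` is a hypothetical name. The escape is followed by a space whenever the next character is a hex digit or whitespace, because CSS would otherwise read that character as part of the escape or swallow it as the escape's terminator.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
CSS_WHITESPACE = set(" \t\n\r\f")


def to_ascii_css(css: str) -> str:
    """Replace non-ASCII characters with CSS hex escapes (hypothetical helper)."""
    # A leading BOM is an encoding marker, not content, so drop it first.
    if css.startswith("\ufeff"):
        css = css[1:]
    out = []
    for i, ch in enumerate(css):
        if ord(ch) < 0x80:
            out.append(ch)
            continue
        escape = "\\" + format(ord(ch), "x")
        nxt = css[i + 1] if i + 1 < len(css) else ""
        # Terminate the escape explicitly when the next character would
        # otherwise be absorbed into it.
        if nxt in HEX_DIGITS or nxt in CSS_WHITESPACE:
            escape += " "
        out.append(escape)
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii_css('p a { content: "\uf00d"; }'))
```

The sketch escapes characters everywhere, including comments, which is harmless but noisier than a parser-aware implementation would be.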

@nex3 changed the title from "Preserve unicode in content rule" to "Add a flag to emit ASCII-only output" on Aug 28, 2014
@nex3 added the Feature label on Aug 28, 2014
@nex3 closed this as completed on Aug 28, 2014
@nex3 reopened this on Aug 28, 2014
@chrisdrackett

I've run into what (I think) is the same problem. I use icon fonts with my projects and in 3.4 I've run into a problem with the following code:

@for $i from 0 through 5
  &.step_#{$i}:before
    content: "\f20" + $i

Before, I would get output in my CSS file with \f200, \f201, etc.

Now I get ༠0, ༠1, etc, which obviously doesn't display my icon fonts as expected.
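
The output above is consistent with the escape being resolved before the concatenation: "\f20" on its own is U+0F20 (TIBETAN DIGIT ZERO, ༠), and the loop index is then appended as a separate character. The Python sketch below models the two orders of operation; it illustrates the behavior reported here and is not Sass's actual implementation, and both function names are hypothetical.

```python
def parse_then_concat(escape_hex: str, suffix: str) -> str:
    """Resolve the escape to a code point first, then append the suffix."""
    return chr(int(escape_hex, 16)) + suffix


def concat_then_parse(escape_hex: str, suffix: str) -> str:
    """Append the suffix to the hex digits first, then resolve one escape."""
    return chr(int(escape_hex + suffix, 16))


if __name__ == "__main__":
    for i in range(6):
        after = parse_then_concat("f20", str(i))   # what 3.4 is reported to emit
        before = concat_then_parse("f20", str(i))  # what the icon font expects
        print([f"U+{ord(c):04X}" for c in after], f"U+{ord(before):04X}")
```

For index 0 the first form is two characters, U+0F20 followed by U+0030, while the second is the single private-use character U+F200.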

@lolmaus

lolmaus commented Sep 3, 2014

@chrisdrackett, it shouldn't make any difference. Do you have your icon font applied correctly?

@chrisdrackett

yes, when moving to the following code things work as expected:

    &.step_0:before
      content: "\f200"
    &.step_1:before
      content: "\f201"
    &.step_2:before
      content: "\f202"
    &.step_3:before
      content: "\f203"
    &.step_4:before
      content: "\f204"
    &.step_5:before
      content: "\f205"

@quantizor

Yup. Things get converted whenever Ruby is needed to do an action on the input. If you don't do concatenation or run a function, it stays as expected.

@chriseppstein

@nex3 It seems like there's no good way to construct an escape sequence now. I think we need a fn to construct a character from an escape value. E.g. char("f205") would return the equivalent of "\f205".

@nex3
Contributor

nex3 commented Sep 4, 2014

@chriseppstein I agree we should have a function that does that, although that's more the purview of #659.

@danny-englander

I just ran into this after a Sass 3.3.14 > 3.4.4 upgrade and it seems like adding quotes as mentioned above fixes it for me.

Before I had:
$fatag: \f02b;
... which subsequently broke with the upgrade.

Now I use:
$fatag: "\f02b";

... and my font-awesome renders as expected.

@quantizor

The conversion happens when passing the character sequence or partial sequence into any function or concatenation that uses Ruby under the hood.

A bare, quoted sequence should be fine as you mentioned, but it's not fixed for the rest of us doing something with the codes.

@quantizor

@nex3 Any ETA on when we might see this? Thanks!

@thenetimp

@nex3 you say it shouldn't be an issue, however this thread proves that it is an issue, and some of us have had to hack mixins to get around it to get it to work.

@nex3
Contributor

nex3 commented Dec 8, 2017

No one here has provided a reproducible case where Unicode output doesn't work in a browser.

@Fil923

Fil923 commented Jan 18, 2018

W3C validation fails for this reason.

@nex3
Contributor

nex3 commented Jan 18, 2018

@Fil923 Can you provide a repro case?

@nex3
Contributor

nex3 commented Apr 5, 2018

I'm moving this issue to the new Ruby Sass repository because it's specific to Ruby Sass's implementation. Once it's there, I'm going to close it as "on ice" because Ruby Sass is deprecated and no additional features are planned for it.

@drewwells

I'm moving this issue to the new Ruby Sass repository because it's specific to Ruby Sass's implementation. Once it's there, I'm going to close it as "on ice" because Ruby Sass is deprecated and no additional features are planned for it.

Please reopen this issue. Sending bugs to a deprecated project and immediately closing it there only frustrates everyone.

@drewwells

drewwells commented Jun 16, 2018

A repro case as requested

$test: "\f00d";

p {
    a {
        content: $test;
    }

    b {
        content: "\f00c";
    }
}

Outputs nonsense

@charset "UTF-8";
p a {
  content: ""; }

p b {
  content: "\f00c"; }

@nex3
Contributor

nex3 commented Jun 18, 2018

Please reopen this issue. Sending bugs to a deprecated project and immediately closing it there only frustrates everyone.

I don't know what you're looking for by asking this issue to be re-opened. No one is working on new features planned for Ruby Sass, so whether this issue is open or not this flag is not going to get implemented.

A repro case as requested

$test: "\f00d";

p {
    a {
        content: $test;
    }

    b {
        content: "\f00c";
    }
}

Outputs nonsense

@charset "UTF-8";
p a {
  content: ""; }

p b {
  content: "\f00c"; }

I requested a reproduction that shows a Unicode escape rendering differently than a Unicode character in a browser. Certainly it's the case that the Unicode code point U+F00C is different than the sequence of code points U+5C U+66 U+30 U+30 U+63, but the CSS spec treats those as identical and as far as I'm aware all browsers follow the spec.
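
The equivalence described here can be checked with a small Python sketch that resolves CSS hex escapes the way the CSS Syntax rules describe: a backslash, one to six hex digits, and one optional whitespace terminator. It is a simplification under stated assumptions: it handles only hex escapes (not escapes such as `\"` or an escaped backslash), and `decode_css_escapes` is a hypothetical helper, not a Sass or browser API.

```python
import re

# Backslash, 1-6 hex digits, then one optional whitespace character that
# terminates the escape and is consumed together with it.
HEX_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})[ \t\n]?")


def _resolve(match: "re.Match[str]") -> str:
    code_point = int(match.group(1), 16)
    # Zero, surrogates, and out-of-range values map to U+FFFD.
    if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return "\ufffd"
    return chr(code_point)


def decode_css_escapes(value: str) -> str:
    """Resolve hex escapes in the contents of a CSS string (hypothetical helper)."""
    return HEX_ESCAPE.sub(_resolve, value)


if __name__ == "__main__":
    escaped = r"\f00c"   # five ASCII characters in the source text
    literal = "\uf00c"   # one private-use character
    print([f"U+{ord(c):04X}" for c in escaped])
    print(decode_css_escapes(escaped) == literal)
```

The two inputs differ as text, but after escape resolution they are the same single code point, which is the sense in which the output is semantically identical.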

@augustotmw

augustotmw commented Sep 11, 2019

Great day to everyone. 😄

I'm having almost the same problem, and one as weird as the cases above, where my:

$icon-location: "\e90a";


.icon-location {
  &:before {
    content: $icon-location;
  }
}

is being converted into

.icon-location:before {
    content: "";
}

Does anyone know why it's happening?

Is there a way to pass some argument to the Sass/Ruby compiler process to avoid this "conversion to Unicode" and keep the original text value?

The server where it happens is using node-sass@4.11.0, if that helps.

Thanks in advance

PS.: @nex3 @TeresaPartidaS @nschonni @asottile

@bardware

content: "";

How do you look at the generated css file? Try Notepad++ and "play" with the file encodings available.

@asottile
Member

if it's in html, make sure to have <meta charset="utf-8">

@augustotmw

Hi! Thanks for the answers 😃

content: "";

How do you look at the generated css file? Try Notepad++ and "play" with the file encodings available.

I looked at it after running an ng build process and opening the generated CSS file in my dev env.

Or when I debug in the inspector, in the hom env.

if it's in html, make sure to have <meta charset="utf-8">

The variable is in a _variables.scss file, and it's imported and applied in another file called _icons.scss. That one is imported in a style.scss file, and this file generates the CSS file with the encoding problem.

I will try to apply the metatag and see if it solves the case.

Thanks in advance

And best regards 😄

@clshortfuse

clshortfuse commented Jul 27, 2020

So, I'm trying to get a zero-width unicode character to work with SASS. It won't appear without a hex editor thanks to SASS reinterpreting that. On normal CSS, it's:

div::before {
  content: "\200B";
}

But SASS will rewrite it as:

div::before {
  content: "";
}

It's a little frustrating trying to debug with an invisible character since SASS wants to rewrite it.

@jathak
Member

jathak commented Jul 28, 2020

@clshortfuse: This issue was for ASCII-only support in Ruby Sass, which is no longer developed. You should follow sass/dart-sass#568 for the status of this feature in Dart Sass.

@cweiske

cweiske commented Jan 29, 2021

Today I was bitten by this; Drupal 7's CSS concatenator does not remove BOMs and thus the CSS breaks - https://www.drupal.org/project/drupal/issues/1833356

@nicolaskopp

Having the same issue. We recently upgraded to latest dart sass ( https://www.npmjs.com/package/sass v.1.32.13 )

this
style.scss

 i:before {
                font-family: "metro-mf-icons";
                content: "\F105";
            }

gets compiled to this

style.css

header .sub-header-store .store-div .store-selector-toggle i:before {
  font-family: "metro-mf-icons";
  content: "";
}

Quite annoying, as we have like a quadrillion scss files, all of which we now have to go through manually to escape any Unicode sequences with some mixins.

While I understand that sass is "just following some nice css3 level specifications", it is effectively breaking working code and I would very much like having a compiler switch which effectively turns this conversion off. It breaks functionality. No update of a software should by default break existing functionality.

@nex3
Contributor

nex3 commented May 19, 2021

@nicolaskopp To be clear: this is not a breaking change. You do not have to manually escape Unicode sequences. As long as you have a @charset "UTF-8"; or UTF-8 BOM at the beginning of your stylesheet (which Sass ensures), all browsers will interpret a literal U+F105 character identically to the escape sequence \F105.

this
style.scss

 i:before {
                font-family: "metro-mf-icons";
                content: "\F105";
            }

gets compiled to this

style.css

header .sub-header-store .store-div .store-selector-toggle i:before {
  font-family: "metro-mf-icons";
  content: "";
}

That's not true. It gets compiled to

@charset "UTF-8";
i:before {
  font-family: "metro-mf-icons";
  content: "";
}

with a @charset declaration.

I'm going to lock this thread since this is not relevant to the Sass language repository. Feel free to follow up on sass/dart-sass#568.

@sass locked as resolved and limited conversation to collaborators on May 19, 2021