Discussion:
Evaluation of CIE dE*, dE94 and dE00 Color Difference Formulas
Timo Autiokari
2006-03-26 18:34:07 UTC
Hi, I just published the page: Evaluation of the CIE Color
Difference Formulas, http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm

There are 4 comparisons at color differences 8, 3, 2 and 1, they are
on-top-of-each-other comparisons, by the three CIE dE formulas (1976,
1994 and 2000). Each comparison has 72 segments (reference colors) that
are compared against 10 nearby colors, at the specified dE distance.
Also the original, as a Photoshop Lab mode document, is available for
download.
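
For readers who want the formulas being compared in front of them, here is a minimal Python sketch of dE(1976) and dE(1994). It is not the code behind the page linked above. The function names are made up for illustration, and the dE94 defaults are the commonly cited graphic-arts constants (kL=1, K1=0.045, K2=0.015). dE(2000) is left out because it is much longer.

```python
import math

def delta_e_76(lab1, lab2):
    """CIE 1976 difference: plain Euclidean distance in CIELAB."""
    return math.dist(lab1, lab2)

def delta_e_94(lab_ref, lab_sample, kL=1.0, K1=0.045, K2=0.015):
    """CIE94 difference; the weights depend on the chroma of the reference."""
    L1, a1, b1 = lab_ref
    L2, a2, b2 = lab_sample
    dL = L1 - L2
    C1 = math.hypot(a1, b1)          # chroma of the reference
    C2 = math.hypot(a2, b2)          # chroma of the sample
    dC = C1 - C2
    # Hue difference squared is what remains of the a*/b* distance after
    # removing the chroma part; clamp tiny negative rounding errors.
    dH_sq = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    SL = 1.0                         # lightness is not re-weighted in CIE94
    SC = 1.0 + K1 * C1
    SH = 1.0 + K2 * C1
    return math.sqrt((dL / (kL * SL)) ** 2 + (dC / SC) ** 2 + dH_sq / SH ** 2)
```

Note that a pure L* step gets the same value from both functions, while a chroma step on a saturated reference is scaled down by dE94. So two pairs "at the same dE" are different physical pairs depending on which formula defines the distance.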

This evaluation clearly shows how incorrect the transfer function of the
CIELAB L* really is. The latest formula, dE(2000), performs worst in
this respect, it manages to amplify the error from the already faulty
transfer function.
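
The "transfer function" in question is the standard CIE 1976 L* definition, sketched below in Python so that the later arguments have something concrete to refer to. The function name is illustrative only; the constants are the standard rational forms of the CIE breakpoint and slope.

```python
def lightness(y_rel):
    """CIE 1976 L* from relative luminance Y/Yn (0.0 to 1.0)."""
    eps = 216 / 24389      # (6/29)**3, where the cube root hands over to a line
    kappa = 24389 / 27     # (29/3)**3, slope of the linear segment
    if y_rel > eps:
        return 116 * y_rel ** (1 / 3) - 16
    return kappa * y_rel
```

With this definition an 18% grey lands a little below L*=50, and the curve is a straight line only for very dark values.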

In my opinion the CIE color difference formulas are not useful at all
for normal, everyday, real-life use.

Timo Autiokari
http://www.aim-dtp.net
Mike Russell
2006-03-26 21:04:33 UTC
Hi, I just published the page: Evaluation of the CIE Color Difference
Formulas, http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
There are 4 comparisons at color differences 8, 3, 2 and 1, they are
on-top-of-each-other comparisons, by the three CIE dE formulas (1976, 1994
and 2000). Each comparison has 72 segments (reference colors) that are
compared against 10 nearby colors, at the specified dE distance. Also the
original, as a Photoshop Lab mode document, is available for download.
Thanks for providing this. Small error: your last table is labelled
deltaE=2 instead of 1.
This evaluation clearly shows how incorrect the transfer function of the
CIELAB L* really is. The latest formula, dE(2000), performs worst in this
respect, it manages to amplify the error from the already faulty transfer
function.
It would also be interesting to know why this is the case, and perhaps
propose a correction.
In my opinion the CIE color difference formulas are not useful at all for
normal, everyday, real-life use.
Lab has been in use for several decades in the commercial world. To my
pragmatic soul, economic benefit is the definition of useful. It may also be
true that there is another color difference formula that will work better.

Perhaps you could design such a formula, or are you perhaps - meaning no
insult - the Pancho Villa of the color world, and Lab itself, as well as the
CIE XYZ space should be discarded :-)?
--
Mike Russell
www.curvemeister.com
Graeme Gill
2006-03-27 01:04:43 UTC
Post by Mike Russell
It would also be interesting to know why this is the case, and perhaps
propose a correction.
I'll venture to make this one observation in this case (normally
I'd ignore this sort of madness - in fact my news reader is set
to automatically ignore certain postings on this newsgroup). Don't
be misled. The CIE delta E formulas were arrived at with a lot of
rigorous experimentation that was careful to isolate the effects
being tested (i.e. color differences).

Consider this:

The layout of the test squares on the page you refer to is one that
relies on the boundary of the squares to emphasize that a square has
a "different" color. It is well known that the human eye is less
spatially sensitive to pure chromatic differences than luminance
differences (see "Color Appearance Models", 1st edition, page 33,
by Mark D. Fairchild), so all that these images prove conclusively
is that, yes indeed, we are more spatially sensitive to luminance
edges than chromatic edges. Any rigorous evaluation of the Delta E
formulas would have to avoid this effect.

Graeme Gill.
t***@aim-dtp.net
2006-03-27 07:33:02 UTC
Post by Graeme Gill
I'll venture to make this one observation in this case (normally
I'd ignore this sort of madness - in fact my news reader is set
to automatically ignore certain postings on this newsgroup).
For the readers, the above innuendo is because Mr. Graeme Gill provides
the Argyll Color Management System that is practically totally based on
dE(1976) color difference.
Post by Graeme Gill
The CIE delta E formulas were arrived at with a lot of rigorous
experimentation that was careful to isolate the effects being
tested (i.e. color differences).
That is so. But in a viewing situation that has absolutely no relevance
to the normal viewing situation that we have, without exception,
in our everyday life.
Post by Graeme Gill
The layout of the test squares on the page you refer to is one that
relies on the boundary of the squares to emphasize that a square has
a "different" color. It is well known that the human eye is less
spatially sensitive to pure chromatic differences than luminance
differences (see "Color Appearance Models", 1st edition, page 33,
by Mark D. Fairchild),
A useful color difference formula should take that luminance
sensitivity properly into account. Or in other words, a color
difference formula should accurately quantify the color difference.
Post by Graeme Gill
so all that these images prove conclusively is that, yes indeed, we are
more spatially sensitive to luminance edges than chromatic edges.
Your thinking seems to be in a closed loop. It does not matter at all
what kind of difference there is, a color difference formula should
accurately quantify that perceptual difference.

Timo Autiokari
t***@aim-dtp.net
2006-03-27 06:42:45 UTC
Post by Mike Russell
It would also be interesting to know why this is the case, and perhaps
propose a correction.
Like I wrote there, the reason is that the CIE color-difference
formulas were created using a special viewing situation, totally
different from how the vision is "biased" in real life. The
current CIE formulas are probably highly valuable for basic
vision/perception research, but for most of the application areas
in which these formulas are used (textile, paint, graphic art,
imaging etc.) they really are not useful at all.

A totally new approach to color difference is indeed needed, one
that reliably quantifies the color differences that we see in
real-life complex scenes (where the vision cannot adapt to any single
particular color difference).

Before that new color difference formula can be proposed, there has to
be a generally accepted understanding of the shortcomings of the
current ones; this is the reason for my comparison. Routinely people
have simply been taking the results from basic vision/perception
research and applying those results, should I say blindly, to
whatever application area; this generates plenty of problems and
misunderstanding.
Post by Mike Russell
It may also be true that there is another color difference formula that will work better.
Indicate one and I will add it to the comparison. All the color
difference formulas that I know of are based on fully adapted
evaluation between two color patches with dark surroundings (those two
patches will set the adaptation of the vision).

We never encounter this kind of viewing situation in the real-life. A
simple example: During the day we do not see the stars on the clear sky
because the light adaptation of the vision is set by the whole scene
that is lit by the strong daylight. During the night we do see them
since the light adaptation of the vision is at way different level.

So, when we view a normal everyday scene, both the light adaptation and
the chromatic adaptation of the vision are set according to that whole
scene. Whatever color difference there is in that scene, the vision is
totally unable to adapt to that particular color difference (but it
will fully adapt to that same color difference when it is presented in
that viewing situation that the color scientists used when they created
the CIE color difference formulas).
Post by Mike Russell
Perhaps you could design such a formula,
Just give me the resources, like accurate instruments, and I will do
that!
Post by Mike Russell
or are you perhaps - meaning no insult - the Pancho Villa of the color world,
and Lab itself, as well as the CIE XYZ space should be discarded :-)?
Certainly not the CIE XYZ. The CIELAB is just a very poor derivation
from the CIE XYZ, maybe it has some uses but one has to know the
shortcomings that it has.

Timo Autiokari
tlianza
2006-03-27 00:19:37 UTC
Timo,

I went to your link and I take exception to the following discourse:
"
Current ICC color management gives one major trouble for color evaluations,
it does not yet manage the blackpoint no matter how accurately the display
is calibrated and profiled. When an RGB=0,0,0 or L*a*b*=0,0,0 color code is
shown on the monitor screen the resulting color is not anywhere near to
absolute black.

"

This is simply not true. The CURRENT version 4 profile
(http://www.color.org/ICC1V42.pdf) provides complete management of a
PHYSICALLY REALIZABLE black point. The TRC in a monitor profile is
supposed to be Canonical, meaning that it is a direct result of measuring
the display. This means that the RGB 0,0,0 entry in the TRC curve should
never be zero (unless of course the monitor is a black hole...). The
physical definition of "black" in the PCS is defined on Page 74 of the
specification. The neutral white is defined as an 89% reflectance with a
density range of 2.4593. This yields a maximum absolute encoded black of a
density of approximately 2.50. Naturally, this conversion leads to a
non-zero L* for black. This is the minimum usable L* value in the PCS.
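
The arithmetic in the paragraph above can be checked with a short Python sketch. The only inputs are the two figures quoted in the post (89% neutral white, density range 2.4593); the variable names and the `lightness` helper are illustrative and not taken from the specification. Whether the quoted "non-zero L*" is meant relative to a perfect diffuser or to the 89% white is not stated in the post, so both are computed.

```python
import math

white_reflectance = 0.89   # neutral white quoted in the post
density_range = 2.4593     # density range quoted in the post

# Density is -log10(reflectance); the black sits one density range below white.
white_density = -math.log10(white_reflectance)
black_density = white_density + density_range
black_reflectance = 10 ** -black_density

def lightness(y_rel):
    """CIE 1976 L* from relative luminance Y/Yn."""
    eps = 216 / 24389
    kappa = 24389 / 27
    return 116 * y_rel ** (1 / 3) - 16 if y_rel > eps else kappa * y_rel

# L* of the black, referred to a perfect diffuser and to the 89% white.
black_L_vs_diffuser = lightness(black_reflectance)
black_L_vs_white = lightness(black_reflectance / white_reflectance)

print(f"black density      {black_density:.4f}")
print(f"black reflectance  {black_reflectance:.5%}")
print(f"L* vs diffuser     {black_L_vs_diffuser:.2f}")
print(f"L* vs 89% white    {black_L_vs_white:.2f}")
```

Under these assumptions the black density comes out at roughly 2.5 and the black L* at roughly 3, in line with the figures in the post.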

The .3 cd/m^2 black that you refer to typically has nothing to do with
ambient. Monitors are routinely set to this value in total darkness. If
you set the display much lower, you enter into a region of pinch off which
often yields no change in luminance for changes in input value. You will
see that your dark luminance is within .02 log units of the maximum encoded
black density in the PCS, when referred to the peak luminance that you
mentioned. This didn't happen by accident.

It is true, that many people, including myself, often generated profiles
incorrectly, but it is patently false to suggest that "ICC color management"
"does not yet manage the blackpoint" . That could be a problem in version
2 profiles, but not in the current specification.

I won't argue your issue with L*, its faults have been discussed by people
with far more experience and knowledge than you...



T. Lianza
--
Tom Lianza
Director of Display and Capture Technologies
GretagMacbeth LLC
3 Industrial Drive
Unit 7&8
Windham, NH 03087
603.681.0315 x232 Tel
603.681.0316 Fax
t***@aim-dtp.net
2006-03-27 07:09:59 UTC
Post by tlianza
This is simply not true. The CURRENT version 4 profile
(http://www.color.org/ICC1V42.pdf) provides complete management of a
PHYSICALLY REALIZABLE black point.
Aaah, yes, that is so, thank you very much. I will reformulate that
paragraph.
Post by tlianza
The .3 cd/m^2 black that you refer to typically has nothing to do with
ambient.
Sure has. The screen surface does reflect the ambient light, that is
the very reason why the ambient light level has to be low for all color
critical work.
Post by tlianza
Monitors are routinely set to this value in total darkness.
That is totally incorrect to do.
Post by tlianza
If you set the display much lower, you enter into a region of pinch off
which often yields no change in luminance for changes in input value.
Hmmm. I warmly suggest that you inspect the behaviour of these
displays, by actually doing some measurements.
Post by tlianza
It is true, that many people, including myself, often generated profiles
incorrectly, but it is patently false to suggest that "ICC color management"
"does not yet manage the blackpoint" . That could be a problem in version
2 profiles, but not in the current specification.
Ok ok, I will reformulate that paragraph. I have v4 profiled
displays... maybe Photoshop does not follow the v4 spec or maybe the
measuring gadget is not able to measure the blackpoint correctly.
Post by tlianza
I won't argue your issue with L*, its faults have been discussed
by people with far more experience and knowledge than you...
Indeed it has been discussed by people with far more experience and
knowledge than what I have. The thing is that they have experience and
knowledge in another application area than what I have... and they are
strongly confined to their precious "sandbox", not willing to see
that there are other application areas also. I'm more interested in the
real world applications of accurate color.

Timo Autiokari
Gernot Hoffmann
2006-03-27 15:41:55 UTC
Timo,

thanks for the presentation.

IMO, the problem has its root already in the use of CIE-Y.
Shouldn't we expect that for 256 random samples, sorted by Y,
each next sample will look brighter than the previous?
That's not the case. Plenty of examples are here in chapter 7:
http://www.fho-emden.de/~hoffmann/gray10012001.pdf

One reason is the Helmholtz-Kohlrausch effect - saturated
colors appear brighter, for the same Y-value.

Another reason is, as already explained by G.G, that a side
by side presentation will emphasize Y-differences (Mach
band effect, lateral inhibition).
But this leads more to a kind of 'corrugated iron effect'
instead of general brightness shifts, for sufficiently large
patches.

If the practical application of the CIE system should exclude
facing color patches, then there is indeed something wrong -
the experiments were of limited value for image processing
and printing.

Best regards --Gernot Hoffmann
Danny Rich
2006-03-28 03:26:37 UTC
I feel a need to write some comments about this thread at this point. Not
so much in defense of CIELAB - which my own doctoral dissertation
challenged - but in challenge of the generalization that Photoshop images
are the only representation of practical reality.

The CIELAB L* function is based on the Munsell Value function. This
function was derived by Dean Judd, Dorothy Nickerson and Sidney Newhall in
1943. It was based on thousands of visual judgements under standard
daylight (as supplied by the National Bureau of Standards of the USA)
holding color chips side by side so that there was minimal or no noticeable
boundary. The belief that a Photoshop display of two colored patches next
to one another is somehow unique is just plain ignorance. Every study
submitted to the CIE for consideration in the derivation of any of the color
spaces or color difference or color tolerance formulas has been based on
visual judgements of two or more color patches shown side by side with
minimal borders. Paint, plastics and even textile shaders have used
side-by-side visual judgements throughout the history of their art. There
is even a painting by the Flemish painter Rubens of a master dyer and his
apprentice looking at two pieces of yarn, wrapped side-by-side on a wooden
slat - being judged for color match.

In addition, the newer formulas have utilized carefully calibrated CRT
displays for simulation of the color differences. The equation known as
CIE94 - which has been summarily dismissed as "useless" - was weighted 2/3 for
the CRT judgements and 1/3 for a combination of more than 10,000 historical
judgements of inks, paints and textiles. The visual judgements were made
first and then the equations derived to fit the visual data. The residual
errors in the formulas are approximately the same as the reproducibility
errors between experienced visual judges of colored materials.

The Helmholtz-Kohlrausch effect is a well known shortcoming of the CIE
system but that's because the system was not designed to solve that problem.
The H-K effect does not occur routinely in industrial colorimetry because
most industries do not attempt to compare the lightness properties of
specimens with large hue differences. All textiles and clothing, most
automotive interiors, most commercial and retail paints and all engineering
plastics are judged for color using CIELAB and/or one of the tolerance
equations in Timo's original posting. As Mike Russell pointed out - there
are a lot of commercial applications of these tools, and if the tools
did not work, there would clearly be no financial incentive to use them.
But, despite their shortcomings, the tools do work and at one time, the Ford
Motor company would pay its suppliers a premium if they did not judge the
product visually but only using CIELAB and the CMC tolerance equation. This
is because the instrumental evaluation yields a higher quality, more
consistent color than the human observer. Marks & Spencer has observed the
same effect and has routinely stopped sending swatches of its color
standards - relying instead - only on the instrumental measurements and
CIELAB and CMC calculations. How much more "every day" can a product become
than the clothes we wear or the cars in which we ride?

Danny Rich
Gernot Hoffmann
2006-03-28 13:25:06 UTC
Danny,

thanks for the feedback.
Yes, image processing isn't the center of the world of
color science.
Photoshop isn't the only program (my illustrations were
made by my program Zebra).

For the test description I should have replaced 'side by
side' and 'facing' by 'adjacent': the patches share an
edge.

http://www.fho-emden.de/~hoffmann/swatch16032005.pdf

This doc shows on p.20... planes of constant Lab hue
by 15° steps, like the common Munsell representation.
The patches are separated from each other.
In my impression, the colors for constant L*, which
means constant Y as well, show a visually increasing
brightness with increasing chroma (saturation).

If this can be generalized (as perceived by other
observers) then there is IMO a bug in the CIE concept.

Using CIELab for printer calibration is IMO agreeable.
I had printed posters for an exhibition and established
a lighting system.
The printing was as usual for D50 and 500 lux (one standard).
The actual viewing conditions were about 3000 K and
min. 50 lux. The posters didn't look wrong.
This says that it is by no means necessary to reproduce
images colorimetrically correctly for the actual lighting in
order to achieve a pleasant appearance.

Best regards --Gernot Hoffmann
Danny Rich
2006-03-29 00:42:09 UTC
The display in your doc on p.20 has also been shown to be true for the
Munsell system - which is, as I have stated, completely visually based. It
is also true for the OSA and the NCS systems but to a lesser degree -
perhaps because they are more recent visual mappings. I do not believe this
to be an error in the CIE system but rather an artifact in CIELAB derived
from the Munsell system against which it has been optimized and the Munsell
renotations which assumed that Judd's Value function could be equally
applied to X and Z to form an opponent system leaving luminance factor and
lightness to be assumed to be orthogonal to the chromaticness. CIE
Technical Committee 1-59 has concluded that such assumptions must not be
included in the next CIE recommended color space metric.

Still, we must be careful not to mix linear properties, like luminance and
lightness, with nonlinear properties like brightness. Further, concepts like
chroma and saturation are not necessarily interchangeable.

Danny Rich
t***@aim-dtp.net
2006-03-29 05:24:03 UTC
Post by Danny Rich
How much more "every day" can a product become
than the clothes we wear or the cars in which we ride?
In reality the industrial use of the CIE dE functions is not to measure
color _difference_. The dE is only used to indicate if two colors are
very similar or not. For that purpose, say dE<1, it sure works;
the Euclidean distance in XYZ would work similarly. When colors are the
same then they are the same.

So the name of the dE functions is incorrect. They are not Color
Difference Functions as they do not give such perceptual result. They
are just go/nogo functions that indicate if the two colors are very
similar or not.
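
As a rough numerical illustration of the XYZ comparison above, the Python sketch below takes two neutral pairs that are both exactly dE(1976)=1 apart, one dark and one light, and computes their Euclidean distance in XYZ. The D50 white point and all function names are assumptions made for this example, and it says nothing about which metric is closer to what is seen.

```python
import math

WHITE_D50 = (0.9642, 1.0, 0.8249)   # assumed reference white (D50), Y = 1

def lab_to_xyz(lab, white=WHITE_D50):
    """Standard CIELAB to XYZ conversion for the given reference white."""
    L, a, b = lab
    fy = (L + 16) / 116
    fx = fy + a / 500
    fz = fy - b / 200

    def f_inv(t):
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    return tuple(w * f_inv(f) for w, f in zip(white, (fx, fy, fz)))

def xyz_distance(lab1, lab2):
    """Euclidean distance in XYZ between two colours given in CIELAB."""
    return math.dist(lab_to_xyz(lab1), lab_to_xyz(lab2))

# Two neutral pairs, each exactly one L* unit (dE76 = 1) apart.
dark_pair = xyz_distance((20, 0, 0), (21, 0, 0))
light_pair = xyz_distance((80, 0, 0), (81, 0, 0))
print(dark_pair, light_pair, light_pair / dark_pair)
```

The two XYZ distances differ by a factor of several, so a single XYZ tolerance and a single dE tolerance would not accept the same pairs across the lightness range.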

Timo Autiokari
Danny Rich
2006-03-30 23:55:29 UTC
Post by t***@aim-dtp.net
Post by Danny Rich
How much more "every day" can a product become
than the clothes we wear or the cars in which we ride?
In reality the industrial use of the CIE dE functions is not to measure
color _difference_. The dE is only used to indicate if two colors are
very similar or not. For that purpose, say dE<1, it sure works;
the Euclidean distance in XYZ would work similarly. When colors are the
same then they are the same.
I must disagree with this analysis. Textile dyeing is an inexact science
and so dyelots are sorted using CIELAB or CMC to assign fabric lots. This
is an absolute use of color difference - even more critical than the graphic
arts applications of judging the colors of visually noisy images.
Post by t***@aim-dtp.net
So the name of the dE functions is incorrect. They are not Color
Difference Functions as they do not give such perceptual result. They
are just go/nogo functions that indicate if the two colors are very
similar or not.
They are perceptual color differences and they are used as such. In
addition, the studies to derive them asked questions like: Is the
difference between these specimens (No difference, Slight difference,
Moderate difference, Significant difference or Large difference)? The
difference between what you are describing and these studies is that for
industrial differences 5 units is a Large difference while the difference
you describe may be 20 or more units. Mike Pointer of the National Physical
Laboratory of the UK has been studying the large differences for some time.
He used to be at Kodak Limited in the UK so graphic and image reproduction is
nothing new to him. His work and some earlier work by Alan Robertson of the
National Research Council in Canada seem to show that there is a
physiological switch in the mechanisms of color perception as the size of
the difference increases above 5 CIELAB units. Color difference metrics
derived from large, global differences (discounting the H-K effect) do not
scale down to the color difference threshold and metrics derived from
differences near to the threshold do not scale up to very large differences.
My own studies and those of some of my colleagues show that there appears to
be an additional Weber-Fechner compression on perceptual scales of
lightness and chroma as the size of the differences get very large.

Danny Rich
Timo Autiokari
2006-03-31 15:43:11 UTC
Post by Danny Rich
I must disagree with this analysis. Textile dyeing is an inexact science
and so dyelots are sorted using CIELAB or CMC to assign fabric lots. This
is an absolute use of color difference - even more critical than the graphic
arts applications of judging the colors of visually noisy images.
They however search, choose & use that particular dye that is the most
close to the target color. They do not use a dye that is perceptually
notably different by a chosen amount. So still a go/nogo test.
Post by Danny Rich
They are perceptual color differences and they are used as such.
I think everyone (less you) will agree that the examples on my page show
that dE is not perceptually uniform at all.
Post by Danny Rich
The difference between what you are describing and these studies is that for
industrial differences 5 units is a Large difference while the difference
you describe may be 20 or more units.
Not so. I had on the page dE=8, dE=3, dE=2 and dE=1 comparisons. Now I
added the dE=5 comparison. The Photoshop Lab original is also updated.
Post by Danny Rich
Mike Pointer of the National Physical Laboratory of the UK has
been studying the large differences for some time.
He used to be at Kodak Limited in the UK so graphic and image reproduction is
nothing new to him. His work and some earlier work by Alan Robertson of the
National Research Council in Canada seem to show that there is a
physiological switch in the mechanisms of color perception as the size of
the difference increases above 5 CIELAB units.
And I show that dE=5, dE=3, dE=2 and dE=1 are not at all perceptually
uniform differences. Anybody can see that at a glance:
http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
Post by Danny Rich
Color difference metrics derived from large, global differences
(discounting the H-K effect) do not scale down to the color difference
threshold and metrics derived from differences near to the threshold
do not scale up to very large differences.
Let's stay in the smaller differences e.g. <= 5 as you noted, the huge
differences are in the first place very difficult to quantify especially
when the hue is changing a lot.

I show that the dE values (from any of the three functions) are not perceptually
anywhere near uniform. Especially when the dE difference is due to the L* alone,
all three formulas fail enormously. dE=5 can be a very good match if
the L* is the same, whereas dE=1 can be a not so good match if it is due to
the L* alone. But then dE=5 can be a rather poor match even if the L* is
the same. What practical uses could such "color difference formulas"
possibly have? None. Yes, the industry does use these formulas, they have
nothing else, and many people firmly believe in them. Luckily most of
the applications are go/nogo tests with very small dE criteria like dE<1,
so these formulas are helpful there (when the two colors are almost
exactly the same). But we really would need a true, accurate color
difference formula.

Timo Autiokari
Danny Rich
2006-04-01 02:44:11 UTC
Post by Timo Autiokari
Post by Danny Rich
I must disagree with this analysis. Textile dyeing is an inexact science
and so dyelots are sorted using CIELAB or CMC to assign fabric lots.
This is an absolute use of color difference - even more critical than the
graphic arts applications of judging the colors of visually noisy images.
They however search, choose & use that particular dye that is the most
close to the target color. They do not use a dye that is perceptually
notably different by a chosen amount. So still a go/nogo test.
Post by Danny Rich
They are perceptual color differences and they are used as such.
This is again not true. This is a standard - true - but the lots are sorted
into those that are 1 unit away, 2 units away and so on. Garments may only
be cut from lots that are 3 units away from the target. They cannot afford
to throw that cloth away - unlike printing where the first 2 to 3 hours of
printing are scrapped.
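
The lot sorting described above can be pictured with a deliberately simplified Python sketch. It bins lots by whole-unit dE(1976) distance from the standard and sets aside anything beyond a limit. The function name, the lot data and the reading of "3 units" as an upper limit are all assumptions for illustration. Real shade-sorting schemes, and the CMC equation mentioned in this thread, are more involved.

```python
import math
from collections import defaultdict

def sort_lots(standard, lots, max_usable=3):
    """Group dye lots by whole-unit dE76 distance from the standard.

    Returns (bins, rejected): bins maps 1, 2, ... to lot names whose
    distance rounds up to that many units; lots beyond max_usable units
    are listed in rejected.
    """
    bins = defaultdict(list)
    rejected = []
    for name, lab in lots.items():
        de = math.dist(standard, lab)
        shade_bin = max(1, math.ceil(de))   # 0 < dE <= 1 goes to bin 1, etc.
        if shade_bin <= max_usable:
            bins[shade_bin].append(name)
        else:
            rejected.append(name)
    return dict(bins), rejected

# Hypothetical standard and lots, as (L*, a*, b*) triples.
standard = (50.0, 10.0, 10.0)
lots = {
    "A": (50.5, 10.0, 10.0),
    "B": (50.0, 11.5, 10.0),
    "C": (50.0, 10.0, 14.0),
}
bins, rejected = sort_lots(standard, lots)
```

The point of the sketch is only that the dE value is used as a graded quantity here (which bin a lot falls in), not merely as a single pass/fail threshold.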
Post by Timo Autiokari
I think everyone (less you) will agree that the examples on my page show
that dE is not perceptually uniform at all.
I never claimed that dE is perceptually uniform - across large distances in
color space. I merely disagreed with your assertion that dE is not useful
for judging color differences. I have also given an explanation why you are
able to show that dE is not visually uniform across the entire range of
visual color space. My own dissertation work demonstrated that the CIELAB
metric is very bad at predicting differences in yellow and yet pretty good
at predicting difference in blues.
Post by Timo Autiokari
Post by Danny Rich
The difference between what you are describing and these studies is that
for industrial differences 5 units is a Large difference while the
difference you describe may be 20 or more units.
Not so. I had on the page dE=8, dE=3, dE=2 and dE=1 comparisons. Now I
added the dE=5 comparison. The Photoshop Lab original is also updated.
Post by Danny Rich
Mike Pointer of the National Physical Laboratory of the UK has been
studying the large differences for some time.
He used to be at Kodak Limited in the UK so graphic and image reproduction
is nothing new to him. His work and some earlier work by Alan Robertson
of the National Research Council in Canada seem to show that there is a
physiological switch in the mechanisms of color perception as the size of
the difference increases above 5 CIELAB units.
And I show that dE=5, dE=3, dE=2 and dE=1 are not at all perceptually
uniform differences. Anybody can see that at a glance:
http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
And the Coil Coaters Association issues a booklet with colored chips mounted
side by side showing differences of CMC dE at 1 and 5 that are visually
uniform. So again - what have we proved? That CIELAB dE is not adequate,
that CMC is better, that CIE94 is about as good and that CIEDE2000 is perhaps
slightly better for acceptance sampling of small but not negligible
differences. Thus the CIE Technical Committee 1-59 is active and looking at
better ways to map visual judgements of colors and color differences.
Post by Timo Autiokari
Post by Danny Rich
Color difference metrics derived from large, global differences
(discounting the H-K effect) do not scale down to the color difference
threshold and metrics derived from differences near to the threshold
do not scale up to very large differences.
Let's stay in the smaller differences e.g. <= 5 as you noted, the huge
differences are in the first place very difficult to quantify especially
when the hue is changing a lot.
I show that none of the three dE functions is perceptually anywhere
near uniform. Especially when the dE difference is due to the L* alone all
the three formulas fail enormously. dE=5 can be a very good match if the
L* is the same, where dE=1 can be not so good match if it is due to the L*
alone. But then dE=5 can be rather poor match even if the L* is the same.
What practical uses could such "color difference formulas" possibly have?
None. Yes the industry does use these formulas, they have nothing else, and
many people firmly believe in them. Luckily most of the applications are
go/nogo tests with very small dE criteria like dE<1 so these formulas are
helpful there (when the two colors are almost exactly the same). But we
really would need a true, accurate, color difference formula.
I continue to be fully puzzled by this since every other study that has ever
been reported has shown that L* is the only component that is truly well
mapped and uniform. It agrees quite well with the Munsell scale used by
artists and the scales used in NTSC and PAL b/w broadcast standards. Most
of these studies are not involved in judging acceptability differences but
in scaling perceived lightness of object surfaces viewed against a 19% gray
background and 2000 lux of simulated daylight of 6500K to 6700K correlated
color temperature.

In your display - referenced above - you have a page of 8 unit differences
in which the only differences perceptible, at least on my monitor, are the
lightness differences. At 8 units - we should not any longer be looking at
color differences but at different colors but only the lightness scales can
be so classified. This is in agreement with my 30 years of experience and
the data in the literature that lightness is well modeled but chromaticness
and hue are not. More importantly - I find that perceived difference
between the center of the block and upper and lower lightness limits are far
more consistent from color center to color center than are the spreads of
the other attributes where sometimes they seem easily perceived and others
they are completely imperceptible.

Danny
Post by Timo Autiokari
Timo Autiokari
Timo Autiokari
2006-04-02 13:54:33 UTC
Permalink
Danny,
Post by Danny Rich
This is again not true. This is a standard
What was it, in your opinion, that "again was not true"? Yes, of course
the dE formulas are standards. But they are incorrect/faulty
standards, they do not deliver what they are said to deliver. They do
not, at all, quantify visual color difference.
Post by Danny Rich
I never claimed that dE is perceptually uniform - across large
distances in color space.
That is so. But here we are discussing only small color differences.
Post by Danny Rich
I merely disagreed with your assertion that dE
is not useful for judging color differences.
Here you say that we disagree. But at the end of your message you indeed
do agree that dE is not useful for judging (small) color differences.
Post by Danny Rich
And the Coil Coaters Association issues a booklet with colored
chips mounted side by side showing differences of CMC dE at 1
and 5 that are visually uniform.
So now you claim that, instead of the CIE dE functions, it is the CMC
function that is visually uniform???

The CIE(1994) is derived from CMC. Could you please tell me what (l,c)
parameters of the CMC were used in that booklet? I will add the CMC
comparison to the evaluation page as soon as I find my copy of the
original spec (BS:6923 "Method for calculation of small colour
differences"). But it really does not notably differ from the CIE(1994).
Post by Danny Rich
what have we proved? That CIELAB dE is not adequate, that CMC
is better, the CIE94 is about as good and that CIEDE2000 is perhaps
slightly better for acceptance sampling of small but not negligible
differences.
I have proved that the CIE color difference functions (CIELAB dE, CIE94
and CIEDE2000) are not good at all for any kind of quantification of
visual color difference, at least not on a color-managed monitor screen.
Post by Danny Rich
I continue to be fully puzzled by this since every other study
that has ever been reported has shown that L* is the only
component that is truly well mapped and uniform.
Why is it that you are puzzled? Or do you allude that the evaluation I
have provided is somehow faulty?
Post by Danny Rich
It agrees quite well with the Munsell scale used by artists and
the scales used in NTSC and PAL b/w broadcast standards. Most
of these studies are not involved in judging acceptability
differences but in scaling perceived lightness of object
surfaces viewed against a 19% gray background and 2000 lux of
simulated daylight of 6500K to 6700K correlated color temperature.
My evaluation has about 19% gray background. A patch on the CRT screen
that is at L*=50 has something like 20cd/m2 luminance. On a similar real
object surface the 2000 lux would translate to something like 130cd/m2
luminance so a rather large difference there. Do you allude that the CIE
color difference formulas fail because of such difference in absolute
luminance?
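The two luminance figures above can be checked with a short calculation. This is only a minimal sketch: it assumes a monitor white of 100 cd/m2 and an ideal matte (Lambertian) surface whose reflectance equals the relative luminance of L*=50. The function names are made up for illustration.

```python
import math

def lstar_to_relative_luminance(lstar):
    """Invert CIE L* to relative luminance Y/Yn in the range 0..1."""
    if lstar > 8.0:
        return ((lstar + 16.0) / 116.0) ** 3
    return lstar / 903.3

def monitor_luminance(lstar, white_cd_m2=100.0):
    """Luminance of a patch on a monitor whose white is white_cd_m2 (assumed)."""
    return lstar_to_relative_luminance(lstar) * white_cd_m2

def surface_luminance(illuminance_lux, reflectance):
    """Luminance of an ideal matte surface: L = E * rho / pi."""
    return illuminance_lux * reflectance / math.pi

rho = lstar_to_relative_luminance(50.0)       # about 0.184
on_screen = monitor_luminance(50.0)           # about 18 cd/m2
on_surface = surface_luminance(2000.0, rho)   # about 117 cd/m2
print(round(on_screen, 1), round(on_surface, 1))
```

Under these assumptions the screen patch comes out near 18 cd/m2 and the illuminated surface near 117 cd/m2, which is in line with the rough "20" and "130" quoted above.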
Post by Danny Rich
In your display - referenced above - you have a page of 8 unit
differences
If you scroll downwards that page:
http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
you will find that there also are similar evaluations for dE=5, dE=3,
dE=2 and dE=1.
Post by Danny Rich
in which the only differences perceptible, at least on my monitor,
are the lightness differences.
Then there is something very wrong with your monitor/system.
Post by Danny Rich
At 8 units - we should not any longer be looking at color differences
but at different colors
This clearly is not so. dE=8 is a rather small color difference in many
cases when that dE=8 is due to a difference in a* and/or b* only (when
the L* is the same).
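To make the point concrete, here is a minimal sketch of the CIE 1976 difference (plain Euclidean distance in L*a*b*). The Lab values are arbitrary illustrations, not taken from the evaluation page.

```python
import math

def delta_e76(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance in L*a*b*."""
    return math.dist(lab1, lab2)

reference = (60.0, 20.0, 30.0)
lightness_only = (68.0, 20.0, 30.0)   # differs in L* only
a_only = (60.0, 28.0, 30.0)           # differs in a* only, same L*

print(delta_e76(reference, lightness_only))  # 8.0
print(delta_e76(reference, a_only))          # 8.0
```

Both pairs get the same number, dE=8, although one differs only in L* and the other only in a*. Whether they look equally different is exactly what is being disputed here.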
Post by Danny Rich
This is in agreement with my 30 years of experience and
the data in the literature that lightness is well modeled
but chromaticness and hue are not.
Thank you very much. In other words: The CIE dE functions are not at all
usable functions for the task of quantification of visual color difference.
Post by Danny Rich
More importantly - I find that perceived difference between the center
of the block and upper and lower lightness limits are far more consistent
from color center to color center than are the spreads of the other
attributes where sometimes they seem easily perceived and others they
are completely imperceptible.
In other words, those "lightness" differences are large (enormous, huge)
compared to those differences that are due to a change in a* and/or b*
only. That they are consistently huge, is not a good thing, is not
desirable, does not make the dE functions valid:

A Color Difference Function should reliably quantify
the visual difference between two colors.

Timo Autiokari
Danny Rich
2006-04-05 01:11:48 UTC
Permalink
Post by Gernot Hoffmann
Danny,
Post by Danny Rich
This is again not true. This is a standard
**** Major typographic errors here ****
The sentence should have read: "This is again not true. There is a
standard - true but the lots are sorted
into those that are 1 unit away, 2 units away and so on. Garments may only
be cut from lots that are 3 units away from the target. They cannot afford
to throw that cloth away - unlike printing where the first 2 to 3 hours of
printing are scrapped"

The point being that the textile industry uses CMC or CIE94 to actually
assess absolute color difference and seems to achieve the goal of sorting
fabric into specific lots of absolute differences from the aim point
effectively.
Post by Gernot Hoffmann
What was it, in your opinion, that "again was not true"? Yes, of course
the dE formulas are standards. But they are incorrect/faulty standards,
they do not deliver what they are said to deliver. They do not, at all,
quantify visual color difference.
Post by Danny Rich
I never claimed that dE is perceptually uniform - across large distances
in color space.
That is so. But here we are discussing only small color differences.
Post by Danny Rich
I merely disagreed with your assertion that dE is not useful for judging
color differences.
Here you say that we disagree. But at the end of your message you indeed
do agree that dE is not useful for judging (small) color differences.
Where did I say that color difference or color tolerance formulas are not
useful? They absolutely are useful and used daily. Many psychophysics
studies have been conducted in which the visual differences are ranked or
magnitude scaled and those scales show strong correlations with the
numerical predictions of modern color difference or color tolerance
equations. Are the correlations perfect - no but then if I show the same
set of colors to the same observer a second or third time the observer will
not have perfect correlation with her own judgments. Self generated visual
errors can occur in as much as 40% of the judgments on small to moderate
color differences (<2 CIELAB deltaE units). CIEDE2000 has been reported to
have errors of only 32% to 38% so the equation is about as consistent as
a normal human observer.
Post by Gernot Hoffmann
Post by Danny Rich
And the Coil Coaters Association issues a booklet with colored chips
mounted side by side showing differences of CMC dE at 1
and 5 that are visually uniform.
So now you claim that, instead of the CIE dE functions, it is the CMC
function that is visually uniform???
The CIE(1994) is derived from CMC. Could you please tell me what (l,c)
parameters of the CMC were used in that booklet? I will add the CMC
comparison to the evaluation page as soon as I find my copy of the
original spec (BS:6923 "Method for calculation of small colour
differences"). But it really does not notably differ from the CIE(1994).
The CIE94 equation was not derived from CMC. Both are derived from CIELAB
using different sets of observations and optimization criteria. Similarly
derived was the Bradford equation DeltaE(BFD). And these equations should
not be conceptualized as being more visually uniform but as distorting the
spacing of the CIELAB metric to better match the results of visual
judgements of small and moderate color differences.

There are two major differences between CMC and CIE94 and one major
similarity. Both equations use a hyperbolic weighting function to expand
the volume of the visual difference region. This region is a sphere in
CIELAB but is a spheroid - not quite an ellipse - in CMC and CIE94. The
main difference between CIE94 and CMC is that CMC has a hyperbolic weighting
on the lightness difference DeltaL* which expands the spheroid in the
lightness direction as the lightness increases. CIE94 keeps the CIELAB
spacing of lightness because in the studies that were done on smooth
surfaces and CRT images of color patches no lightness non-uniformity could
be detected in the visual observations. The second area of difference
between the two tolerance equations is the weighting for the metric hue term
DeltaH*. CMC has both a chroma and hue function but CIE94 has only a chroma
function.
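The CIE94 side of this comparison can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: it assumes the commonly published constants 0.045 and 0.015 for the chroma and hue weights, and uses the chroma of the first (reference) color in both.

```python
import math

def delta_e94(lab_ref, lab_sample, kL=1.0, kC=1.0, kH=1.0):
    """CIE94 difference; lab_ref is treated as the reference color."""
    L1, a1, b1 = lab_ref
    L2, a2, b2 = lab_sample
    C1 = math.hypot(a1, b1)
    C2 = math.hypot(a2, b2)
    dL = L1 - L2
    dC = C1 - C2
    # Metric hue difference squared; clamp tiny negative rounding errors.
    dH_sq = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    SL = 1.0                 # lightness keeps the plain CIELAB spacing
    SC = 1.0 + 0.045 * C1    # chroma weight grows with reference chroma
    SH = 1.0 + 0.015 * C1    # hue weight depends on chroma only
    return math.sqrt((dL / (kL * SL)) ** 2
                     + (dC / (kC * SC)) ** 2
                     + dH_sq / (kH * SH) ** 2)

print(delta_e94((50.0, 0.0, 0.0), (55.0, 0.0, 0.0)))    # lightness-only pair
print(delta_e94((50.0, 40.0, 0.0), (50.0, 48.0, 0.0)))  # chroma-only pair
```

With these constants a lightness step of 5 stays at 5, while a chroma step of 8 at C*=40 is divided by 2.8.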

The (l,c) terms are additional weights to expand or contract the shape of
the ellipsoid for certain "parametric" effects. Parametric effects include
things like how close together the specimens are placed, how smooth the
surfaces are or whether there are goniochromatic additives in the specimen
like metal flakes, pearlescent or fluorescent materials and so forth. For
textiles and plastics with a weave pattern or rough surface it is not
possible to judge lightness differences as effectively as when the surface
is smooth so the axis of the spheroid in the DeltaL* direction is expanded
by an amount (l,) and a similar additional correction may be added to
the axis in the DeltaC* direction (,c). There is no weight in the DeltaH*
direction since it was known at the time that hue was always the most
critical parameter and its weights should always be 1 so there is an implied
ratio of L to C to H so only the L and C weights are shown as (l,c) or
sometimes as (l:c). New tolerance equations such as CIE94 and CIEDE2000
allow for non-proportional weights on each axis, resulting in a set (l:c:h)
of parametric weights.
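How the (l,c) weights enter the CMC equation can be sketched as follows. This is a minimal sketch assuming the commonly published CMC(l:c) constants; the function name and the default l=2, c=1 are illustrative choices, not something taken from the booklet under discussion.

```python
import math

def delta_e_cmc(lab_ref, lab_sample, l=2.0, c=1.0):
    """CMC(l:c) difference; lab_ref is treated as the reference color."""
    L1, a1, b1 = lab_ref
    L2, a2, b2 = lab_sample
    C1 = math.hypot(a1, b1)
    C2 = math.hypot(a2, b2)
    dL = L1 - L2
    dC = C1 - C2
    dH_sq = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - dC ** 2, 0.0)
    h1 = math.degrees(math.atan2(b1, a1)) % 360.0

    # Hyperbolic lightness weight (constant below L* = 16).
    SL = 0.511 if L1 < 16.0 else 0.040975 * L1 / (1.0 + 0.01765 * L1)
    # Hyperbolic chroma weight.
    SC = 0.0638 * C1 / (1.0 + 0.0131 * C1) + 0.638
    # Hue weight depends on both chroma (F) and hue angle (T).
    F = math.sqrt(C1 ** 4 / (C1 ** 4 + 1900.0))
    if 164.0 <= h1 <= 345.0:
        T = 0.56 + abs(0.2 * math.cos(math.radians(h1 + 168.0)))
    else:
        T = 0.36 + abs(0.4 * math.cos(math.radians(h1 + 35.0)))
    SH = SC * (F * T + 1.0 - F)

    # l and c stretch the lightness and chroma axes; hue has no extra weight.
    return math.sqrt((dL / (l * SL)) ** 2
                     + (dC / (c * SC)) ** 2
                     + dH_sq / SH ** 2)

gray, lighter = (50.0, 0.0, 0.0), (52.0, 0.0, 0.0)
print(delta_e_cmc(gray, lighter, l=1.0, c=1.0))
print(delta_e_cmc(gray, lighter, l=2.0, c=1.0))
```

For a pure lightness difference, CMC(2:1) reports exactly half of what CMC(1:1) reports, which is the axis stretching described above.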
Post by Gernot Hoffmann
Post by Danny Rich
what have we proved? That CIELAB dE is not adequate, that CMC is better,
the CIE94 is about as good and that CIEDE2000 is perhaps
slightly better for acceptance sampling of small but not negligible
differences.
I have proved that the CIE color difference functions (CIELAB dE, CIE94
and CIEDE2000) are not good at all for any kind of quantification of
visual color difference, at least not on a color-managed monitor screen.
Post by Danny Rich
I continue to be fully puzzled by this since every other study that has
ever been reported has shown that L* is the only
component that is truly well mapped and uniform.
Why is it that you are puzzled? Or do you allude that the evaluation I
have provided is somehow faulty?
Post by Danny Rich
It agrees quite well with the Munsell scale used by artists and the
scales used in NTSC and PAL b/w broadcast standards. Most
of these studies are not involved in judging acceptability differences
but in scaling perceived lightness of object
surfaces viewed against a 19% gray background and 2000 lux of
simulated daylight of 6500K to 6700K correlated color temperature.
My evaluation has about 19% gray background. A patch on the CRT screen
that is at L*=50 has something like 20cd/m2 luminance. On a similar real
object surface the 2000 lux would translate to something like 130cd/m2
luminance so a rather large difference there. Do you allude that the CIE
color difference formulas fail because of such difference in absolute
luminance?
This could be one possible issue. It is known that in the mesopic region -
(luminances less than 50 and more than 5) there is both rod intrusion into
the cone responses as it seems that blue cones and the rods share the visual
pathways to the visual cortex.
Post by Gernot Hoffmann
Post by Danny Rich
In your display - referenced above - you have a page of 8 unit differences
http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
you will find that there also are similar evaluations for dE=5, dE=3, dE=2
and dE=1.
Post by Danny Rich
in which the only differences perceptible, at least on my monitor, are
the lightness differences.
Then there is something very wrong with your monitor/system.
Post by Danny Rich
At 8 units - we should not any longer be looking at color differences but
at different colors
This clearly is not so. dE=8 is a rather small color difference in many
cases when that dE=8 is due to a difference in a* and/or b* only (when the
L* is the same).
The steps between chips in the Munsell or NCS system are about 8 to 10
CIELAB units - I do not call those small differences.
Post by Gernot Hoffmann
Post by Danny Rich
This is in agreement with my 30 years of experience and the data in the
literature that lightness is well modeled but chromaticness and hue are
not.
Thank you very much. In other words: The CIE dE functions are not at all
usable functions for the task of quantification of visual color difference.
Post by Danny Rich
More importantly - I find that perceived difference between the center of
the block and upper and lower lightness limits are far more consistent
from color center to color center than are the spreads of the other
attributes where sometimes they seem easily perceived and others they
are completely imperceptible.
In other words, those "lightness" differences are large (enormous, huge)
compared to those differences that are due to a change in a* and/or b*
only. That they are consistently huge, is not a good thing, is not
A Color Difference Function should reliably quantify
the visual difference between two colors.
Timo Autiokari
Timo Autiokari
2006-04-05 13:22:20 UTC
Permalink
Post by Danny Rich
Where did I say that color difference or color tolerance
formulas are not useful?
You wrote: "lightness is well modeled but chromaticness and hue are
not." To me this is pretty much the same as to say that such Color
Difference Formulas are not useful at all.
Post by Danny Rich
Many psychophysics studies have been conducted in which the
visual differences are ranked or magnitude scaled and those
scales show strong correlations with the numerical predictions
of modern color difference or color tolerances equations.
That is so, psychophysics studies, under a viewing situation that is
_totally different_ from the normal viewing situations where we assess
color differences in our everyday life (paint, textile, graphic art,
imaging etc).
Post by Danny Rich
Do you allude that the CIE color difference formulas
fail because of such difference in absolute luminance?
This could be one possible issue. It is known that in
the mesopic region - (luminances less than 50 and more
than 5) there is both rod intrusion into the cone responses
as it seems that blue cones and the rods share the visual
pathways to the visual cortex.
In what units is that mesopic range??? The most usual definition for the
mesopic range is from 3.4 cd/m2 down to 0.034 cd/m2. On a typical
monitor that has max output at 100 cd/m2 this equals to L* range from
21.6 to 0.3. My evaluation is mostly above L*=50.
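The conversion from those luminances to L* is easy to reproduce. A minimal sketch, assuming the monitor white (100 cd/m2) is the reference white for L*:

```python
def relative_luminance_to_lstar(y):
    """CIE L* from relative luminance y = Y/Yn (0..1)."""
    if y > 216.0 / 24389.0:            # about 0.008856
        return 116.0 * y ** (1.0 / 3.0) - 16.0
    return (24389.0 / 27.0) * y        # about 903.3 * y

white = 100.0  # cd/m2, assumed monitor white
upper = relative_luminance_to_lstar(3.4 / white)
lower = relative_luminance_to_lstar(0.034 / white)
print(round(upper, 1), round(lower, 1))  # 21.6 0.3
```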

I have now updated the page with CMC(2,1) and CMC(1,1) evaluations:
http://www.aim-dtp.net/aim/evaluation/cie_de/index.htm
(press shift-reload in case the CMC charts are not visible).

The CMC(2,1) is ridiculously incorrect. The CMC(1,1) seems to
behave incorrectly in much the same way as the CIE00(1,1,1) does.

You said earlier that you are fully puzzled by this so I believe we
do agree that these color difference formulas perform very very
badly in this evaluation.

Are you not, at all, interested in finding the correct answer to this
dilemma?

Timo Autiokari

Graeme Gill
2006-03-28 03:24:41 UTC
Permalink
Post by Gernot Hoffmann
IMO, the problem has its root already in the use of CIE-Y.
Shouldn't we expect that for 256 random samples, sorted by Y,
each next sample will look brighter than the previous ?
http://www.fho-emden.de/~hoffmann/gray10012001.pdf
One reason is the Helmholtz-Kohlrausch effect - saturated
colors appear brighter, for the same Y-value.
I've wondered at times if using my version of CIECAM02 which
includes the Helmholtz-Kohlrausch effect (in Argyll), would produce
better looking grey scale conversions than some other approaches.

Graeme Gill.
t***@aim-dtp.net
2006-03-28 05:41:12 UTC
Permalink
Post by Gernot Hoffmann
IMO, the problem has its root already in the use of CIE-Y.
Shouldn't we expect that for 256 random samples, sorted by Y,
each next sample will look brighter than the previous ?
http://www.fho-emden.de/~hoffmann/gray10012001.pdf
Hello Gernot,

that was a real eye-opener. Now my belief in the CIE system is gone,
or at least strongly shaken. I reproduced your test with 1024
patches, made a dozen such charts and indeed the failure is there,
in each of them.
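A chart of this kind can be generated along the following lines. This is only a sketch of the idea, not the procedure actually used for the charts mentioned here: it assumes sRGB-encoded patches and the sRGB/Rec. 709 luminance weights, and the function names are made up.

```python
import random

def srgb_to_linear(c):
    """Decode one sRGB channel (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """CIE Y (relative, 0..1) of an sRGB color with channels in 0..1."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def random_patches_sorted_by_y(n=1024, seed=0):
    """n random sRGB colors, ordered from lowest to highest Y."""
    rng = random.Random(seed)
    patches = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    return sorted(patches, key=relative_luminance)

patches = random_patches_sorted_by_y()
print(patches[0], patches[-1])
```

Painting the patches in this order should give a monotonic brightness ramp if Y alone predicted brightness. The test is whether each patch really looks brighter than the one before it.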
Post by Gernot Hoffmann
One reason is the Helmholtz-Kohlrausch effect
- saturated colors appear brighter, for the same Y-value.
Yes, it is a generally accepted reason/explanation. But deep down there
are the CMFs; could it be that the root cause, however, is that the CMFs are in
fact this incorrect?
Post by Gernot Hoffmann
Another reason is, as already explained by G.G, that a side
by side presentation will emphasize Y-differences (Mach
band effect, lateral inhibition).
But this leads more to a kind of 'corrugated iron effect'
instead of general brightness shifts, for sufficiently large
patches.
Yes, the edge-effect is not a major error source in this evaluation.

Now I will inspect more with random samples. Thank you very much for
the information!

Timo Autiokari