===========================================================================
Hi. If you wish to discuss this document, the appropriate forum is
probably www-style@w3.org. Make sure you give the URI of this document
if you do bring it up!
===========================================================================

ABSTRACT

This is a proposal for changes to the selector module draft [1] and
user interface module draft [2] which takes into account all the
selector suggestions of which I am aware.

SUMMARY OF MAIN CHANGES FROM CSS3 DRAFTS

   + Not exact match simple selector: !ns|element
   + Substring match attribute selector: [ns|attr$="substring"]
   + Numeric attribute selectors: [ns|attr<=0] [ns|attr>=0]
   + Regular expression attribute selector: [ns|attr?="regexp"]
   + Negative attribute selectors: [!...]
   + Exact match content selector: ="text"
   + Substring match content selector: $="text"
   + Regular expression match content selector: ?="text"
   + Negative content selectors: !...
   + :replaced pseudo-class
   + :first-node, :last-node pseudo-classes
   + :only-child, :only-node, :only-of-type pseudo-classes
   + :child(n,m), :child-of-type(n,m) pseudo-classes
   + :children(n,m), :children-of-type(n,m) pseudo-classes
   + :empty pseudo-class
   + :matches, :matched pseudo-classes (instead of :selected)
   + :not-... pseudo-classes
   + change pseudo-element prefix to "::" from ":"
   + ::first-word, ::first-words(n) pseudo-elements
   + ::line(n), ::line(n,m), ::lines(a,b) pseudo-elements
   + ::last-line pseudo-element
   + Reference combinator: /.../

COMPLETE PROPOSAL

Selectors consist of one or more simple selectors separated from each
other by combinators.

SIMPLE SELECTORS

A simple selector starts either with an optional universal selector or
with a type selector, followed by zero or more of the following:
attribute selectors, ID selectors, content selectors, pseudo-classes or
pseudo-elements. (The universal selector is optional because it may be
omitted if there are other parts to the simple selector as well.)
The subject of the selector is *always* the elements represented by
the last sequence of simple selectors in the selector.

TYPE SELECTORS

A type selector represents a particular element, optionally in a
particular namespace (see the namespace draft [3] for an explanation
of how to declare a namespace).

In each case the element type may be prefixed by a namespace
specifier, which may be either a short name, a wildcard, or empty. The
meanings are as follows:

   ns|GI -- elements with name GI in namespace ns.
   *|GI  -- elements with name GI in any namespace.
   |GI   -- elements with name GI without an explicit namespace.
   GI    -- if no default namespace has been specified, this is
            equivalent to *|GI. Otherwise it is equivalent to ns|GI
            where ns is the default namespace.

A wildcard may also be substituted for the element name, in which case
it is known as the universal selector.

   ns|*  -- all elements in the namespace "ns"
   *|*   -- all elements
   |*    -- all elements without a namespace specifier
   *     -- if no default namespace has been specified, this is
            equivalent to *|*. Otherwise it is equivalent to ns|*
            where ns is the default namespace.

Not specifying the namespace at all is equivalent to giving the
namespace which has been set as the default. If the default is
unspecified then it is taken to be "*".

EXACT MATCH
   e.g. ns|GI

NOT EXACT MATCH
   e.g. !ns|GI
   matches all elements that are _not_ of type "GI" in the "ns"
   namespace.
   e.g. !*|*
   matches no elements at all (and is therefore pointless).

-- NOTES

Element type selectors probably do not need to be any more complicated
than the above. In any case there is usually a small, and certainly a
finite, set of element names, set by the DTD, so regexp or substring
matches needn't be used to cater for all situations. For example,

   ?"H[1-6]"

...doesn't make life that much easier than saying

   H1, H2, H3, H4, H5, H6

...which, given the HTML DTD, is all it is standing for, and looks a
lot neater.
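
As an illustration only (it is not part of the proposal), the
namespace prefix rules for type selectors given above could be sketched
in Python roughly as follows. The function name, the "prefixes" mapping
from short names to namespace URIs, and the use of None to mean "no
namespace" are all assumptions made for this sketch.

```python
def type_selector_matches(selector, element_ns, element_name,
                          prefixes, default_ns=None):
    """Return True if a type selector such as 'ns|GI' matches an element.

    element_ns is the element's namespace URI, or None if it has none.
    prefixes maps short names to namespace URIs (hypothetical structure;
    an undeclared short name raises KeyError in this sketch).
    default_ns is the default namespace URI, or None if none was declared.
    """
    negate = selector.startswith("!")   # "!ns|GI" is the not-exact match
    if negate:
        selector = selector[1:]

    if "|" in selector:
        prefix, name = selector.split("|", 1)
        if prefix == "*":
            ns_ok = True                          # *|GI: any namespace
        elif prefix == "":
            ns_ok = element_ns is None            # |GI: no explicit namespace
        else:
            ns_ok = element_ns == prefixes[prefix]    # ns|GI
    elif default_ns is None:
        name, ns_ok = selector, True              # GI, no default: like *|GI
    else:
        name, ns_ok = selector, element_ns == default_ns

    name_ok = name == "*" or name == element_name     # "*" is the wildcard
    matched = ns_ok and name_ok
    return not matched if negate else matched
```

With no default namespace declared, a bare "P" behaves like "*|P"
here, and "!*|*" never matches anything, as described in the text.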

ATTRIBUTE SELECTORS

An attribute selector matches elements which have the relevant
attributes or attribute values. A rich set of methods of matching is
required, since attributes come in many different syntaxes.

In each case the attribute name may be prefixed by a namespace
specifier and the vertical bar, and the namespace specifier may be
either a short name, a wildcard, or empty. Not specifying the namespace
is equivalent to giving an empty namespace. The meanings are as
follows:

   ns|attr -- performs the test on the "attr" attribute of the "ns"
              namespace. The element's default attributes are _not_
              examined, even if the element itself is in the "ns"
              namespace.
   *|attr  -- performs the test for every attribute "attr" in any
              namespace, including the default attribute (i.e., the
              one without an explicit namespace).
   |attr   -- performs the test on the default attribute of the
              element (i.e., the attribute named "attr" which does not
              have an explicit namespace).
   attr    -- exactly equivalent to |attr (default namespaces do not
              apply to attributes).

PRESENCE
   e.g. [ns|attr]

EXACT MATCH
   e.g. [ns|attr="value"]

SUBSTRING MATCH
   e.g. [ns|attr$="part"]

SPACE SEPARATED KEYWORD MATCH
   e.g. [ns|attr~="keyword"]

SPACE SEPARATED KEYWORD MATCH FOR html:class ATTRIBUTE
   e.g. .keyword

HYPHEN SEPARATED ROOT MATCH
   e.g. [ns|attr|="root"]

NUMERIC GREATER-THAN-OR-EQUAL-TO MATCH
   e.g. [ns|attr>=0]
   does not match if the attribute is not numeric.

NUMERIC LESS-THAN-OR-EQUAL-TO MATCH
   e.g. [ns|attr<=0]
   does not match if the attribute is not numeric.

REGEXP MATCH
   e.g. [ns|attr?="regexp"]

NOT PRESENCE
   e.g. [!ns|attr]

NOT EXACT MATCH
   e.g. [!ns|attr="value"]

NOT SUBSTRING MATCH
   e.g. [!ns|attr$="part"]

NOT SPACE SEPARATED KEYWORD MATCH
   e.g. [!ns|attr~="keyword"]

NOT HYPHEN SEPARATED ROOT MATCH
   e.g. [!ns|attr|="root"]

NOT NUMERIC GREATER-THAN-OR-EQUAL-TO MATCH
   e.g. [!ns|attr>=0]
   does not match if the attribute is not numeric.

NOT NUMERIC LESS-THAN-OR-EQUAL-TO MATCH
   e.g. [!ns|attr<=0]
   does not match if the attribute is not numeric.

NOT REGEXP MATCH
   e.g. [!ns|attr?="regexp"]

-- NOTES

Greater than but _not_ equal to can be selected by:

   [attr>=0][!attr<=0]

Similarly, numeric equality is the same as:

   [attr>=0][attr<=0]

Neither of the above will match if the attribute value is missing or
non-numeric, however.

ID SELECTORS

An ID selector matches elements which have a particular ID. By
definition, an ID selector can match at most one element per document.

EXACT MATCH
   e.g. #id

CONTENT SELECTORS

A content selector matches elements which have particular content.
(The syntax is purposefully the same as that for attribute selectors
but without the square brackets.)

EXACT MATCH
   e.g. ="text"

SUBSTRING MATCH
   e.g. $="text"

REGEXP MATCH
   e.g. ?="text"

NOT EXACT MATCH
   e.g. !="text"

NOT SUBSTRING MATCH
   e.g. !$="text"

NOT REGEXP MATCH
   e.g. !?="text"

PSEUDO-CLASSES

A pseudo-class matches elements based on information that lies outside
of the document tree or that cannot be expressed using the other
simple selectors.

DYNAMIC PSEUDO-CLASSES

LINK PSEUDO-CLASSES

:link - matches elements that are links to documents that have not yet
   been visited. (How this is determined is left up to the UA and is
   outside the scope of CSS.)

:visited - matches elements that are links to documents that have
   already been visited. (How this is determined is left up to the UA
   and is outside the scope of CSS.)

UI ACTION PSEUDO-CLASSES

:hover - matches elements that have the pointing device within their
   outer border edge.

:active - matches elements while they are being activated by the user
   (only elements whose 'user-input' property has the value of
   "enabled" can become :active).

:focus - matches elements that have the UI focus (only elements whose
   'user-input' property has the value of "enabled" can acquire
   :focus).

:enabled - matches elements whose 'user-input' property has the value
   of "enabled".

:disabled - matches elements whose 'user-input' property has the value
   of "disabled".

:checked - matches elements which have been checked or picked (only
   those elements whose 'user-input' property has the value of
   "enabled" or "disabled" can be :checked).

TARGET PSEUDO-CLASS

:target - matches elements whose ID is identical to the fragment
   identifier of the current URI.

REPLACED PSEUDO-CLASS

:replaced - matches replaced elements. e.g. IMG elements would match
   this if the image was found, but if the image is broken/not
   displayed for whatever reason, and the alt text is shown instead,
   then it would not match. See footnote [B].

LANGUAGE PSEUDO-CLASS

:lang(x) - matches elements in language x. (The language is inherited
   down the document tree in document-language specific ways. Refer to
   the relevant specs - HTML, XML - for details.) The language code is
   matched in the same way as for the |= attribute selector.

STRUCTURAL PSEUDO-CLASSES

:first-child - same as :child(1).

:first-node - same as :first-child, but *ONLY* if there is no #PCDATA
   anonymous content preceding the first child (ignoring any ignorable
   whitespace).

:first-of-type - same as :child-of-type(1).

:last-child - matches elements that have no later siblings. The same
   as :child(-1). The following will never match:

      *:last-child ~ * { }

:last-node - same as :last-child, but *ONLY* if there is no #PCDATA
   anonymous content following the last child (ignoring any ignorable
   whitespace).

:last-of-type - matches elements that have no later siblings with the
   same element name. The following will never match:

      X:last-of-type ~ X { }

:only-child - matches an element that has no siblings. Same as
   :first-child:last-child or :child(1):child(-1).

:only-node - matches an element that has no siblings, not even #PCDATA
   siblings. Same as :first-node:last-node.

:only-of-type - matches an element that has no siblings with the same
   element name. Same as :first-of-type:last-of-type or
   :child-of-type(1):child-of-type(-1).

:child(n) - directly equivalent to :child(n,0).

:child-of-type(n) - directly equivalent to :child-of-type(n,0).

:child(n,m) - matches an element that has n+xm-1 siblings before it in
   the document tree, for all x. (n>=1, m=0 or m>=n, x>=0). In other
   words, this matches the nth child of each group once all the
   children have been split into groups of m elements each. This
   allows selectors to address every other row in a table and could be
   used, for example, to alternate the colour of paragraph text in a
   cycle of four.

      TR:child(1,2) /* address every odd row */
      TR:child(2,2) /* address every even row */

      /* Alternate paragraph colours: */
      P:child(1,4) { color: navy; }
      P:child(2,4) { color: green; }
      P:child(3,4) { color: maroon; }
      P:child(4,4) { color: purple; }

   When m=0, no repeating is used, so :child(5,0) matches only the
   fifth child.

   If n is negative, then counting starts from the end of the element.
   For example,

      /* Alternate paragraph colours: */
      P:child(-1,4) { color: navy; }
      P:child(-2,4) { color: green; }
      P:child(-3,4) { color: maroon; }
      P:child(-4,4) { color: purple; }

   ...results in the same as the previous example, except that the
   last P of each block is guaranteed to be navy.

:child-of-type(n,m) - matches an element that has n+xm-1 siblings with
   the same element name before it in the document tree, for all x.
   (n>=1, m=0 or m>=n, x>=0). In other words, this matches the nth
   child of that type in each group once all the children of that type
   have been split into groups of m elements each. For example, this
   allows us to alternate the position of floated images:

      IMG:child-of-type(1,2) { float: right; }
      IMG:child-of-type(2,2) { float: left; }

   When m=0, no repeating is used, so :child-of-type(5,0) matches only
   the fifth child of that type.

   If n is negative, then counting starts from the end of the element.
   For example,

      IMG:child-of-type(-1,2) { float: right; }
      IMG:child-of-type(-2,2) { float: left; }

   ...results in the same as the previous example, except that the
   last IMG of each block is guaranteed to be on the right.

:children(a,b) - matches all elements that are the ath child, the bth
   child, or any child in between, of their parent. Negative numbers
   mean count from the end of the element. For example, :children(3,5)
   matches the 3rd, 4th and 5th children of every element.

:children-of-type(a,b) - matches all elements of each type that are
   the ath child of that type, the bth child of that type, or any
   child of that type in between, of their parent. Negative numbers
   mean count from the end of the element. For example,
   TR:children-of-type(2,-1) means all rows apart from the first (and
   is equivalent to TR:not-first-of-type).

:empty - matches an element which has no children (including text
   nodes).

:root - matches elements that are the root of their document tree.

:matches(SELECTOR) - matches elements if the selector so far with
   SELECTOR appended to it would match that element or one of its
   descendants. (See footnote [A].)

      H1:matches(+P)
         /* H1s that are followed by paragraphs */
      BODY:matches(BLOCKQUOTE P IMG.signature)
         /* BODY elements that contain one or more BLOCKQUOTEs
            containing a P containing an IMG element of class
            "signature". */
      P:matches(+ H2) CITE
         /* matches CITE elements inside P elements which are
            immediately before an H2 element */

:matched(SELECTOR) - matches elements if the simple selector with
   SELECTOR _prefixed_ to it would match that element.

      H2:matched(P+)
         /* equivalent to P+H2 */
      H1 ~ H2:matched(P+)
         /* equivalent to H1 ~ P + H2 */
      A:matched(B):matched(C)
         /* matches elements A that have both a B ancestor and a C
            ancestor */

NEGATIVE PSEUDO-CLASSES

For consistency, every pseudo-class has an equivalent that matches
elements that do _not_ match the positive pseudo-class.
For example, :not-hover matches elements that the pointing device is
_not_ designating. Similarly, :not-child-of-type(3,7) matches elements
that are _not_ the 3rd sibling in each group of 7.

For completeness, here is a list of all the negative pseudo-classes:

   :not-*+*

Note that :not-enabled is NOT the same as :disabled, as some (most)
elements will be neither enabled nor disabled.

PSEUDO-ELEMENTS

::before - the pseudo-element that is just inside every element.

::after - the pseudo-element that is just before the end of every
   element.

::first-letter - the first letter of every element.

::first-word - the first word of every element. Equivalent to the long
   form ::first-words(1). The definition of "word" used is that used
   for the 'text-transform' property.

::first-words(n) - the first n words of every element. The definition
   of "word" used is that used for the 'text-transform' property.

::first-line - the root inline box of the first line box of every
   block element which contains inline elements. Equivalent to the
   long form ::line(1,0).

::line(n) - the root inline box of the nth line box of every block
   element which contains inline elements. Equivalent to ::line(n,0).
   Takes the same properties as ::first-line.

::line(n,m) - the root inline box of the (n+xm)th line box of every
   block element which contains inline elements, for all x. (n>=1, m=0
   or m>=n, x>=0). When m=0, no repeating is used, so ::line(5,0)
   matches only the fifth line box of a block. Takes the same
   properties as ::first-line.

   If 'n' is negative, then perform the above calculations but
   starting from the bottom of the element. So ::line(-2,0) matches
   the penultimate line, and ::line(-2,2) matches every second line,
   starting from the penultimate line and going up the element.

::lines(a,b) - selects every root inline box from the ath to the bth
   in every block level element which contains inline elements. If
   either number is negative, then counting starts from the last line
   of the block.
   The range may go either forward or backwards. For example,
   ::lines(1,5) selects the first five lines; ::lines(-2,2) selects
   every line except the first and last (and demonstrates that a need
   not be less than b).

::last-line - the root inline box of the last line box of every block
   element which contains inline elements. Takes the same properties
   as ::first-line. Directly equivalent to ::line(-1,0).

::selection - applies to the portion of a document that has been
   highlighted by the user. Only elements whose 'user-select' has a
   value other than 'none' can have a ::selection.

::access-key - the part of the element which represents the
   'key-equivalent' key combination.

::menu - the contents of the element. See the CSS3 UI draft.

::inside, ::outside - see David's proposal [4].

-- NOTES ON PSEUDO-ELEMENT RULES

Something needs to be decided about how pseudo-elements inherit from
their surroundings, how their contents inherit from them, and also
which properties apply to each. For example, the ::selection
pseudo-element can actually select straight across boundaries, so

   ::selection { font-size: 1.5em; }

...would probably not result in the same font-size all across the
selection. This should be defined.

COMBINATORS

DESCENDANT COMBINATORS

INDIRECT DESCENDANT COMBINATOR ( )
   e.g. A B
   matches B in:

DIRECT DESCENDANT (CHILD) COMBINATOR (>)
   e.g. A > B
   matches B in:

ADJACENT SIBLING COMBINATORS

INDIRECT ADJACENT COMBINATORS (~)
   e.g. A ~ B
   matches B in:

DIRECT ADJACENT COMBINATORS (+)
   e.g. A + B
   matches B in:

REFERENCE COMBINATOR (/.../)
   e.g. IMG /USEMAP/ MAP AREA
   Matches all AREA elements which are descendants of the MAP element
   pointed to by the attribute USEMAP of an IMG element. If the
   attribute is IDREF, then the element selected must be the one which
   would match #XXX where XXX is the contents of the attribute in
   question. Otherwise, it is the element pointed to by the URI which
   the attribute represents, as per the target-counter,
   target-content, and target-attr functions.
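
To make the reference combinator more concrete, here is a rough Python
sketch of how "IMG /USEMAP/ MAP AREA" might be resolved within a single
document. It is an illustration under several assumptions, not a
definition: the helper names are invented, the ID attribute is assumed
to be literally named "id" (a real UA would consult the DTD), a value
beginning with "#" is treated as a same-document URI fragment, and
URIs pointing into other documents are not handled at all.

```python
import xml.etree.ElementTree as ET


def follow_reference(root, source, attr_name, id_attr="id"):
    """Return the element that source's attribute points to, or None.

    Assumption: the attribute holds either an IDREF ("nav") or a
    same-document fragment URI ("#nav"); both resolve to the element
    whose id_attr equals "nav".
    """
    value = source.get(attr_name)
    if value is None:
        return None
    target_id = value[1:] if value.startswith("#") else value
    for candidate in root.iter():
        if candidate.get(id_attr) == target_id:
            return candidate
    return None


def select_via_reference(root, source_tag, attr_name, target_tag,
                         descendant_tag):
    """Sketch of 'SOURCE /ATTR/ TARGET DESCENDANT' (hypothetical helper)."""
    matches = []
    for source in root.iter(source_tag):
        target = follow_reference(root, source, attr_name)
        if target is None or target.tag != target_tag:
            continue
        for element in target.iter(descendant_tag):
            # Skip the target itself and anything already collected.
            if element is not target and element not in matches:
                matches.append(element)
    return matches
```

Called as select_via_reference(doc, "IMG", "USEMAP", "MAP", "AREA"),
this collects the AREA descendants of whichever MAP each IMG refers to
and ignores MAP elements that no IMG points at.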

DISCUSSION

WHY :SELECTED IS A BAD THING...

What is :selected ? Is it a pseudo-class or pseudo-element? The answer
is 'neither'. It is an entirely new type of selector which simply
changes the subject of the selector chain. Thus it is inconsistent
with the rest of the selectors draft. (I propose that we use
:matches() instead. See footnote [A] below.)

...AND WE SHOULD SEPARATE PSEUDO CLASSES FROM PSEUDO ELEMENTS

Secondly, what is :-foo-bar ? Is it a pseudo-class or pseudo-element?
The "-foo-" prefix is a mechanism designed by the working group to
flag extensions in a forward-compatible way. However, implementations
with 'open-ended' engines have no way of knowing which it is supposed
to be, pseudo-class or -element, and thus have no easy way of dealing
with them internally.

Thus I propose that pseudo-elements be changed to use the :: prefix
instead, as I have used above.

   ::before

...and so forth. This will also hopefully make people think more
carefully about whether something is a pseudo-class or -element, and
they may even realise that things like :selected are neither one nor
the other...

FOOTNOTES

[A] I suggest that in the initial specification of :matches(), this
selector must be the last selector in the chain. This makes it
directly equivalent to the :selected selector in the CSS3 Selectors WD
of August 1999, with the only syntactic differences being the
brackets:

   X:selected A
   X:matches(A)

Then, in later specifications, the :matches() pseudo-class can be
extended to allow it to appear anywhere in a selector:

   X:matches(A) B

...which cannot be done using the :selected pseudo-thing.

This means that the initial implementation burden on implementors is
no greater for :matches than for :selected, and should thus remove
most objections. (See the DISCUSSION section for why :selected is evil
and :matches is better...)

Also, the :matched() pseudo-class could be combined with the
:matches() pseudo class in some way, for example:

   X:matched(A):matches(B)

could be written as:

   X:matches(A # B)

I do not have a view either way.

[B] This would probably be defined in terms of the replaced content
proposals, such as the "content:replaced(attr(src)), attr(alt)"
suggestion or suchlike.

REFERENCES

[1] CSS3 Selectors Draft, W3C:
    http://www.w3.org/TR/1999/WD-CSS3-selectors-19990803
[2] CSS3 User Interface Draft, W3C:
    http://www.w3.org/TR/1999/WD-css3-userint-19990916
[3] CSS3 Namespace Draft, W3C:
    http://www.w3.org/1999/06/25/WD-css3-namespace-19990625/
[4] ::inside & ::outside, David Baron:
    http://lists.w3.org/Archives/Public/www-style/2000Mar/0043.html

ACKNOWLEDGEMENTS

Thanks to:
   Sjoerd Visscher
   Bert Bos
   David Baron

-- 
Ian Hickson