
class.t3lib_parsehtml_proc.php File Reference


Namespaces

namespace  TYPO3

Classes

class  t3lib_parsehtml_proc

Functions

 TS_preserve_db ($value)
 Preserve special tags.
 TS_preserve_rte ($value)
 Preserve special tags.
 TS_transform_db ($value, $css=FALSE)
 Transformation handler: 'ts_transform' + 'css_transform' / direction: "db". Cleaning (->db) for standard content elements (ts).
 TS_transform_rte ($value, $css=0)
 Transformation handler: 'ts_transform' + 'css_transform' / direction: "rte". Set (->rte) for standard content elements (ts).
 TS_strip_db ($value)
 Transformation handler: 'ts_strip' / direction: "db". Removes all non-allowed tags.
 getURL ($url)
 Reads the file or url $url and returns the content.
 HTMLcleaner_db ($content, $tagList='')
 Function for cleaning content going into the database.
 getKeepTags ($direction='rte', $tagList='')
 Creates an array of configuration for the HTMLcleaner function based on whether content goes TO or FROM the Rich Text Editor ($direction). Unless "tagList" is given, the function caches the configuration for the next time processing goes on.
 divideIntoLines ($value, $count=5, $returnArray=FALSE)
 This resolves the $value into parts based on <div></div>-sections, <p>-sections and <br />-tags.
 setDivTags ($value, $dT='p')
 Converts all lines into <div></div>/<p></p>-sections.
 internalizeFontTags ($value)
 This splits the $value in font-tag chunks.
 siteUrl ()
 Returns SiteURL based on thisScript.
 rteImageStorageDir ()
 Return the storage folder of RTE image files.
 removeTables ($value, $breakChar='<br />')
 Remove all tables from incoming code. The function tries to do this in a more or less respectful way.
 defaultTStagMapping ($code, $direction='rte')
 Default tag mapping for TS.
 getWHFromAttribs ($attribArray)
 Finds width and height from the attribute array. If the width and height are found in the style attribute, those values are used.
 urlInfoForLinkTags ($url)
 Parse <a>-tag href and return status of email, external, file or page.
 TS_AtagToAbs ($value, $dontSetRTEKEEP=FALSE)
 Converting <a>-tags to absolute URLs (+ setting rtekeep attribute).


Function Documentation

defaultTStagMapping ($code, $direction='rte')

Default tag mapping for TS.

Parameters:
string Input code to process
string Direction: to database (db) or from database to RTE (rte)
Returns:
string Processed value

Definition at line 1376 of file class.t3lib_parsehtml_proc.php.

Referenced by TS_transform_db().

01376                                                          {
01377       if ($direction=='db')   {
01378          $code=$this->mapTags($code,array(   // Map tags
01379             'strong' => 'b',
01380             'em' => 'i'
01381          ));
01382       }
01383       if ($direction=='rte')  {
01384          $code=$this->mapTags($code,array(   // Map tags
01385             'b' => 'strong',
01386             'i' => 'em'
01387          ));
01388       }
01389       return $code;
01390    }
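
The mapping above can be illustrated with a minimal, self-contained sketch. The parent-class method mapTags() is not shown on this page, so `mapTagsSketch()` and `defaultTagMappingSketch()` below are hypothetical stand-ins that only rename opening and closing tags with a regular expression; the real implementation may work differently.

```php
<?php
// Hypothetical stand-in for mapTags(): renames tags, keeps attributes untouched.
function mapTagsSketch(string $html, array $map): string
{
    foreach ($map as $from => $to) {
        // Match "<from" or "</from" only when followed by whitespace, "/" or ">",
        // so that mapping "b" does not touch "<br>" or "<blockquote>".
        $pattern = '#<(/?)' . preg_quote($from, '#') . '(?=[\s/>])#i';
        $html = preg_replace($pattern, '<${1}' . $to, $html);
    }
    return $html;
}

// Same direction logic as the listing above, written as a free function.
function defaultTagMappingSketch(string $code, string $direction = 'rte'): string
{
    if ($direction === 'db') {
        return mapTagsSketch($code, ['strong' => 'b', 'em' => 'i']);
    }
    if ($direction === 'rte') {
        return mapTagsSketch($code, ['b' => 'strong', 'i' => 'em']);
    }
    return $code;
}
```

Under these assumptions, `defaultTagMappingSketch('<b>x</b>', 'rte')` yields `<strong>x</strong>`, and passing the result back with direction `'db'` restores the original markup.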

divideIntoLines ($value, $count=5, $returnArray=FALSE)

This resolves the $value into parts based on <div></div>-sections, <p>-sections and <br />-tags. These are returned as lines separated by chr(10).

The point is to resolve the HTML code returned from the RTE into ordinary lines so that it is 'human-readable'. The function ->setDivTags does the opposite. This function processes content that is going into the database.

Parameters:
string Value to process.
integer Recursion brake. Decremented on each recursion down to zero. Default is 5 (which equals the allowed nesting levels of p/div tags).
boolean If true, an array with the lines is returned, otherwise a string of the processed input value.
Returns:
string Processed input value.
See also:
setDivTags()

Definition at line 1137 of file class.t3lib_parsehtml_proc.php.

References HTMLcleaner_db(), and internalizeFontTags().

Referenced by TS_transform_db().

01137                                                                   {
01138 
01139          // Internalize font tags (move them from OUTSIDE p/div to inside it that is the case):
01140       if ($this->procOptions['internalizeFontTags'])  {$value = $this->internalizeFontTags($value);}
01141 
01142          // Setting configuration for processing:
01143       $allowTagsOutside = t3lib_div::trimExplode(',',strtolower($this->procOptions['allowTagsOutside']?$this->procOptions['allowTagsOutside']:'img'),1);
01144       $remapParagraphTag = strtoupper($this->procOptions['remapParagraphTag']);
01145       $divSplit = $this->splitIntoBlock('div,p',$value,1);  // Setting the third param to 1 will eliminate false end-tags. Maybe this is a good thing to do...?
01146 
01147       if ($this->procOptions['keepPDIVattribs'])   {
01148          $keepAttribListArr = t3lib_div::trimExplode(',',strtolower($this->procOptions['keepPDIVattribs']),1);
01149       } else {
01150          $keepAttribListArr = array();
01151       }
01152 
01153          // Returns plainly the value if there was no div/p sections in it
01154       if (count($divSplit)<=1 || $count<=0)  {
01155          return $value;
01156       }
01157 
01158          // Traverse the splitted sections:
01159       foreach($divSplit as $k => $v)   {
01160          if ($k%2)   {  // Inside
01161             $v=$this->removeFirstAndLastTag($v);
01162 
01163                // Fetching 'sub-lines' - which will explode any further p/div nesting...
01164             $subLines = $this->divideIntoLines($v,$count-1,1);
01165             if (is_array($subLines))   {  // So, if there happend to be sub-nesting of p/div, this is written directly as the new content of THIS section. (This would be considered 'an error')
01166                // No noting.
01167             } else { //... but if NO subsection was found, we process it as a TRUE line without erronous content:
01168                $subLines = array($subLines);
01169                if (!$this->procOptions['dontConvBRtoParagraph'])  {  // process break-tags, if configured for. Simply, the breaktags will here be treated like if each was a line of content...
01170                   $subLines = spliti('<br[[:space:]]*[\/]?>',$v);
01171                }
01172 
01173                   // Traverse sublines (there is typically one, except if <br/> has been converted to lines as well!)
01174                reset($subLines);
01175                while(list($sk)=each($subLines)) {
01176 
01177                      // Clear up the subline for DB.
01178                   $subLines[$sk]=$this->HTMLcleaner_db($subLines[$sk]);
01179 
01180                      // Get first tag, attributes etc:
01181                   $fTag = $this->getFirstTag($divSplit[$k]);
01182                   $tagName=strtolower($this->getFirstTagName($divSplit[$k]));
01183                   $attribs=$this->get_tag_attributes($fTag);
01184 
01185                      // Keep attributes (lowercase)
01186                   $newAttribs=array();
01187                   if (count($keepAttribListArr))   {
01188                      foreach($keepAttribListArr as $keepA)  {
01189                         if (isset($attribs[0][$keepA]))  { $newAttribs[$keepA] = $attribs[0][$keepA]; }
01190                      }
01191                   }
01192 
01193                      // ALIGN attribute:
01194                   if (!$this->procOptions['skipAlign'] && strcmp(trim($attribs[0]['align']),'') && strtolower($attribs[0]['align'])!='left') {  // Set to value, but not 'left'
01195                      $newAttribs['align']=strtolower($attribs[0]['align']);
01196                   }
01197 
01198                      // CLASS attribute:
01199                   if (!$this->procOptions['skipClass'] && strcmp(trim($attribs[0]['class']),''))   {  // Set to whatever value
01200                      if (!count($this->allowedClasses) || in_array(strtoupper($attribs[0]['class']),$this->allowedClasses))   {
01201                         $newAttribs['class']=$attribs[0]['class'];
01202                      }
01203                   }
01204 
01205                      // Remove any line break char (10 or 13)
01206                   $subLines[$sk]=ereg_replace(chr(10).'|'.chr(13),'',$subLines[$sk]);
01207 
01208                      // If there are any attributes or if we are supposed to remap the tag, then do so:
01209                   if (count($newAttribs) && strcmp($remapParagraphTag,'1'))      {
01210                      if ($remapParagraphTag=='P')  $tagName='p';
01211                      if ($remapParagraphTag=='DIV')   $tagName='div';
01212                      $subLines[$sk]='<'.trim($tagName.' '.$this->compileTagAttribs($newAttribs)).'>'.$subLines[$sk].'</'.$tagName.'>';
01213                   }
01214                }
01215             }
01216                // Add the processed line(s)
01217             $divSplit[$k] = implode(chr(10),$subLines);
01218 
01219                // If it turns out the line is just blank (containing a &nbsp; possibly) then just make it pure blank:
01220             if (trim(strip_tags($divSplit[$k]))=='&nbsp;')     $divSplit[$k]='';
01221          } else { // outside div:
01222                // Remove positions which are outside div/p tags and without content
01223             $divSplit[$k]=trim(strip_tags($divSplit[$k],'<'.implode('><',$allowTagsOutside).'>'));
01224             if (!strcmp($divSplit[$k],''))   unset($divSplit[$k]);   // Remove part if it's empty
01225          }
01226       }
01227 
01228          // Return value:
01229       return $returnArray ? $divSplit : implode(chr(10),$divSplit);
01230    }

getKeepTags ($direction='rte', $tagList='')

Creates an array of configuration for the HTMLcleaner function based on whether content goes TO or FROM the Rich Text Editor ($direction). Unless "tagList" is given, the function caches the configuration for the next time processing goes on.

(In this class "tagList" is given only when a bullet list is being processed.)

Parameters:
string The direction of the content being processed by the output configuration; "db" (content going into the database FROM the rte) or "rte" (content going into the form)
string Comma list of tags to keep (overriding default which is to keep all + take notice of internal configuration)
Returns:
array Configuration array
See also:
HTMLcleaner_db()

Definition at line 1028 of file class.t3lib_parsehtml_proc.php.

Referenced by HTMLcleaner_db(), and setDivTags().

01028                                                       {
01029       if (!is_array($this->getKeepTags_cache[$direction]) || $tagList)  {
01030 
01031             // Setting up allowed tags:
01032          if (strcmp($tagList,''))   {  // If the $tagList input var is set, this will take precedence
01033             $keepTags = array_flip(t3lib_div::trimExplode(',',$tagList,1));
01034          } else { // Default is to get allowed/denied tags from internal array of processing options:
01035                // Construct default list of tags to keep:
01036             $typoScript_list = 'b,i,u,a,img,br,div,center,pre,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span';
01037             $keepTags = array_flip(t3lib_div::trimExplode(',',$typoScript_list.','.strtolower($this->procOptions['allowTags']),1));
01038 
01039                // For tags to deny, remove them from $keepTags array:
01040             $denyTags = t3lib_div::trimExplode(',',$this->procOptions['denyTags'],1);
01041             foreach($denyTags as $dKe) {
01042                unset($keepTags[$dKe]);
01043             }
01044          }
01045 
01046             // Based on the direction of content, set further options:
01047          switch ($direction)  {
01048 
01049                // GOING from database to Rich Text Editor:
01050             case 'rte':
01051                   // Transform bold/italics tags to strong/em
01052                if (isset($keepTags['b'])) {$keepTags['b']=array('remap'=>'STRONG');}
01053                if (isset($keepTags['i'])) {$keepTags['i']=array('remap'=>'EM');}
01054 
01055                   // Transforming keepTags array so it can be understood by the HTMLcleaner function. This basically converts the format of the array from TypoScript (having .'s) to plain multi-dimensional array.
01056                list($keepTags) = $this->HTMLparserConfig($this->procOptions['HTMLparser_rte.'],$keepTags);
01057             break;
01058 
01059                // GOING from RTE to database:
01060             case 'db':
01061                   // Transform strong/em back to bold/italics:
01062                if (isset($keepTags['strong']))  { $keepTags['strong']=array('remap'=>'b'); }
01063                if (isset($keepTags['em']))      { $keepTags['em']=array('remap'=>'i'); }
01064 
01065                   // Setting up span tags if they are allowed:
01066                if (isset($keepTags['span']))    {
01067                   $classes=array_merge(array(''),$this->allowedClasses);
01068                   $keepTags['span']=array(
01069                      'allowedAttribs'=>'class',
01070                      'fixAttrib' => Array(
01071                         'class' => Array (
01072                            'list' => $classes,
01073                            'removeIfFalse' => 1
01074                         )
01075                      ),
01076                      'rmTagIfNoAttrib' => 1
01077                   );
01078                   if (!$this->procOptions['allowedClasses'])   unset($keepTags['span']['fixAttrib']['class']['list']);
01079                }
01080 
01081                   // Setting up font tags if they are allowed:
01082                if (isset($keepTags['font']))    {
01083                   $colors=array_merge(array(''),t3lib_div::trimExplode(',',$this->procOptions['allowedFontColors'],1));
01084                   $keepTags['font']=array(
01085                      'allowedAttribs'=>'face,color,size',
01086                      'fixAttrib' => Array(
01087                         'face' => Array (
01088                            'removeIfFalse' => 1
01089                         ),
01090                         'color' => Array (
01091                            'removeIfFalse' => 1,
01092                            'list'=>$colors
01093                         ),
01094                         'size' => Array (
01095                            'removeIfFalse' => 1,
01096                         )
01097                      ),
01098                      'rmTagIfNoAttrib' => 1
01099                   );
01100                   if (!$this->procOptions['allowedFontColors'])   unset($keepTags['font']['fixAttrib']['color']['list']);
01101                }
01102 
01103                   // Setting further options, getting them from the processiong options:
01104                $TSc = $this->procOptions['HTMLparser_db.'];
01105                if (!$TSc['globalNesting'])   $TSc['globalNesting']='b,i,u,a,center,font,sub,sup,strong,em,strike,span';
01106                if (!$TSc['noAttrib'])  $TSc['noAttrib']='b,i,u,br,center,hr,sub,sup,strong,em,li,ul,ol,blockquote,strike';
01107 
01108                   // Transforming the array from TypoScript to regular array:
01109                list($keepTags) = $this->HTMLparserConfig($TSc,$keepTags);
01110             break;
01111          }
01112 
01113             // Caching (internally, in object memory) the result unless tagList is set:
01114          if (!$tagList) {
01115             $this->getKeepTags_cache[$direction] = $keepTags;
01116          } else {
01117             return $keepTags;
01118          }
01119       }
01120 
01121          // Return result:
01122       return $this->getKeepTags_cache[$direction];
01123    }
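
The first step of the listing above (building the list of tags to keep from a default list, the 'allowTags' option and the 'denyTags' option) can be sketched in isolation. `trimExplodeSketch()` is a hypothetical replacement for t3lib_div::trimExplode(), and `buildKeepTagsSketch()` is an illustrative name; the remapping, the span/font attribute rules and the per-direction cache are left out.

```php
<?php
// Hypothetical helper: split a comma list, trim entries, drop empty ones.
function trimExplodeSketch(string $list): array
{
    $items = array_map('trim', explode(',', $list));
    return array_values(array_filter($items, static function (string $item): bool {
        return $item !== '';
    }));
}

// Default tags plus allowed tags (lowercased), minus denied tags.
function buildKeepTagsSketch(string $defaultList, string $allowTags, string $denyTags): array
{
    // Flipping turns tag names into array keys, which removes duplicates
    // and makes it cheap to unset denied tags.
    $keepTags = array_flip(trimExplodeSketch($defaultList . ',' . strtolower($allowTags)));
    foreach (trimExplodeSketch($denyTags) as $tag) {
        unset($keepTags[$tag]);
    }
    return array_keys($keepTags);
}
```

As in the listing, only the allow list is lowercased, so a deny list written in uppercase would not remove anything in this sketch.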

getURL ($url)

Reads the file or url $url and returns the content.

Parameters:
string Filepath/URL to read
Returns:
string The content from the resource given as input.
See also:
t3lib_div::getURL()

Definition at line 993 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_htmlmail::fetchHTML(), and t3lib_htmlmail::getExtendedURL().

00993                            {
00994       return t3lib_div::getURL($url);
00995    }

getWHFromAttribs ($attribArray)

Finds width and height from the attribute array. If the width and height are found in the style attribute, those values are used.

Parameters:
array Array of attributes from tag in which to search. More specifically the content of the key "style" is used to extract "width:xxx / height:xxx" information
Returns:
array Integer w/h in key 0/1. Zero is returned if not found.

Definition at line 1399 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_parsehtml_proc::TS_images_db().

01399                                              {
01400       $style =trim($attribArray['style']);
01401       if ($style) {
01402          $regex='[[:space:]]*:[[:space:]]*([0-9]*)[[:space:]]*px';
01403             // Width
01404          eregi('width'.$regex,$style,$reg);
01405          $w = intval($reg[1]);
01406             // Height
01407          eregi('height'.$regex,$style,$reg);
01408          $h = intval($reg[1]);
01409       }
01410       if (!$w) {
01411          $w = $attribArray['width'];
01412       }
01413       if (!$h) {
01414          $h = $attribArray['height'];
01415       }
01416       return array(intval($w),intval($h));
01417    }
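
The listing relies on eregi(), which is no longer available in current PHP versions. A minimal sketch of the same lookup using preg_match() might look as follows; `getWidthHeightSketch()` is an illustrative name, not part of the class.

```php
<?php
// Returns array(width, height) as integers; 0 means "not found".
function getWidthHeightSketch(array $attribs): array
{
    $w = 0;
    $h = 0;
    $style = trim($attribs['style'] ?? '');
    if ($style !== '') {
        // Pixel values in the style attribute take precedence.
        if (preg_match('/width\s*:\s*(\d+)\s*px/i', $style, $m)) {
            $w = (int)$m[1];
        }
        if (preg_match('/height\s*:\s*(\d+)\s*px/i', $style, $m)) {
            $h = (int)$m[1];
        }
    }
    // Fall back to the plain width/height attributes.
    if (!$w) {
        $w = (int)($attribs['width'] ?? 0);
    }
    if (!$h) {
        $h = (int)($attribs['height'] ?? 0);
    }
    return [$w, $h];
}
```

Like the original pattern, this sketch does not distinguish `width` from properties such as `max-width`, and it ignores units other than px.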

HTMLcleaner_db ($content, $tagList='')

Function for cleaning content going into the database.

Content is cleaned, e.g. by removing disallowed HTML and undoing htmlspecialchars conversion (HSC), so that literals such as &lt; become characters again. It basically calls HTMLcleaner from the parent class with a preset configuration set up specifically for cleaning content going from the RTE into the database.

Parameters:
string Content to clean up
string Comma list of tags to specifically allow. Default comes from getKeepTags and is ""
Returns:
string Clean content
See also:
getKeepTags()

Definition at line 1007 of file class.t3lib_parsehtml_proc.php.

References getKeepTags().

Referenced by divideIntoLines(), and TS_transform_db().

01007                                                    {
01008       if (!$tagList) {
01009          $keepTags = $this->getKeepTags('db');
01010       } else {
01011          $keepTags = $this->getKeepTags('db',$tagList);
01012       }
01013       $kUknown = $this->procOptions['dontRemoveUnknownTags_db'] ? 1 : 0;      // Default: remove unknown tags.
01014       $hSC = $this->procOptions['dontUndoHSC_db'] ? 0 : -1;             // Default: re-convert literals to characters (that is &lt; to <)
01015 
01016       return $this->HTMLcleaner($content,$keepTags,$kUknown,$hSC);
01017     }

internalizeFontTags ($value)

This splits the $value into font-tag chunks.

If there are any <P>/<DIV> sections inside of them, the font-tag is wrapped AROUND the content INSIDE of the P/DIV sections and the outer font-tag is removed. This function seems to be a good choice for pre-processing content that has been pasted into the RTE from e.g. StarOffice. In that case the font-tags are normally on the OUTSIDE of the sections. This function is used by e.g. divideIntoLines() if the processing option 'internalizeFontTags' is set.

Parameters:
string Input content
Returns:
string Output content
See also:
divideIntoLines()

Definition at line 1286 of file class.t3lib_parsehtml_proc.php.

Referenced by divideIntoLines().

01286                                           {
01287 
01288          // Splitting into font tag blocks:
01289       $fontSplit = $this->splitIntoBlock('font',$value);
01290 
01291       foreach($fontSplit as $k => $v)  {
01292          if ($k%2)   {  // Inside
01293             $fTag = $this->getFirstTag($v);  // Fint font-tag
01294 
01295             $divSplit_sub = $this->splitIntoBlock('div,p',$this->removeFirstAndLastTag($v),1);
01296             if (count($divSplit_sub)>1)   {  // If there were div/p sections inside the font-tag, do something about it...
01297                   // traverse those sections:
01298                foreach($divSplit_sub as $k2 => $v2)   {
01299                   if ($k2%2)  {  // Inside
01300                      $div_p = $this->getFirstTag($v2);   // Fint font-tag
01301                      $div_p_tagname = $this->getFirstTagName($v2);   // Fint font-tag
01302                      $v2=$this->removeFirstAndLastTag($v2); // ... and remove it from original.
01303                      $divSplit_sub[$k2]=$div_p.$fTag.$v2.'</font>'.'</'.$div_p_tagname.'>';
01304                   } elseif (trim(strip_tags($v2))) {
01305                      $divSplit_sub[$k2]=$fTag.$v2.'</font>';
01306                   }
01307                }
01308                $fontSplit[$k]=implode('',$divSplit_sub);
01309             }
01310          }
01311       }
01312 
01313       return implode('',$fontSplit);
01314    }

removeTables ($value, $breakChar='<br />')

Remove all tables from incoming code. The function tries to do this in a more or less respectful way.

The approach is to resolve each table cell's content and implode it all with <br /> chars. Thus at least the content is preserved in some way.

Parameters:
string Input value
string Break character to use for linebreaks.
Returns:
string Output value

Definition at line 1344 of file class.t3lib_parsehtml_proc.php.

References table().

01344                                                       {
01345 
01346          // Splitting value into table blocks:
01347       $tableSplit = $this->splitIntoBlock('table',$value);
01348 
01349          // Traverse blocks of tables:
01350       foreach($tableSplit as $k => $v) {
01351          if ($k%2)   {
01352             $tableSplit[$k]='';
01353             $rowSplit = $this->splitIntoBlock('tr',$v);
01354             foreach($rowSplit as $k2 => $v2) {
01355                if ($k2%2)  {
01356                   $cellSplit = $this->getAllParts($this->splitIntoBlock('td',$v2),1,0);
01357                   foreach($cellSplit as $k3 => $v3)   {
01358                      $tableSplit[$k].=$v3.$breakChar;
01359                   }
01360                }
01361             }
01362          }
01363       }
01364 
01365          // Implode it all again:
01366       return implode($breakChar,$tableSplit);
01367    }
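
A rough, self-contained sketch of this approach is given below. It replaces the parent-class helpers splitIntoBlock() and getAllParts() with regular expressions, so it is an approximation only: `removeTablesSketch()` is an illustrative name, and nested tables are not handled.

```php
<?php
// Flattens every <table> block into its cell contents, each followed by $breakChar.
function removeTablesSketch(string $value, string $breakChar = '<br />'): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the odd indexes hold the <table>...</table> blocks.
    $parts = preg_split('#(<table\b[^>]*>.*?</table>)#is', $value, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($parts as $k => $part) {
        if ($k % 2) {
            // Collect the inner content of every <td> cell in document order.
            preg_match_all('#<td\b[^>]*>(.*?)</td>#is', $part, $cells);
            $parts[$k] = '';
            foreach ($cells[1] as $cell) {
                $parts[$k] .= $cell . $breakChar;
            }
        }
    }
    // As in the listing, the pieces are joined with the break character again.
    return implode($breakChar, $parts);
}
```

Because each cell is followed by the break character and the pieces are joined with it as well, a table surrounded by text ends with two consecutive breaks, just as the listing's logic implies.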

rteImageStorageDir ()

Return the storage folder of RTE image files.

Default is $GLOBALS['TYPO3_CONF_VARS']['BE']['RTE_imageStorageDir'] unless something else is configured in the types configuration for the RTE.

Returns:
string

Definition at line 1332 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_parsehtml_proc::TS_images_db().

01332                                  {
01333       return $this->rte_p['imgpath'] ? $this->rte_p['imgpath'] : $GLOBALS['TYPO3_CONF_VARS']['BE']['RTE_imageStorageDir'];
01334    }

setDivTags ($value, $dT='p')

Converts all lines into <div></div>/<p></p>-sections (unless the line is a div-section already). For processing of content going FROM database TO RTE.

Parameters:
string Value to convert
string Tag to wrap with. Should be either "p" or "div", preferably lowercase.
Returns:
string Processed value.
See also:
divideIntoLines()

Definition at line 1241 of file class.t3lib_parsehtml_proc.php.

References getKeepTags().

Referenced by TS_transform_rte().

01241                                        {
01242 
01243          // First, setting configuration for the HTMLcleaner function. This will process each line between the <div>/<p> section on their way to the RTE
01244       $keepTags = $this->getKeepTags('rte');
01245       $kUknown = $this->procOptions['dontProtectUnknownTags_rte'] ? 0 : 'protect';  // Default: remove unknown tags.
01246       $hSC = $this->procOptions['dontHSC_rte'] ? 0 : 1;  // Default: re-convert literals to characters (that is &lt; to <)
01247       $convNBSP = !$this->procOptions['dontConvAmpInNBSP_rte']?1:0;
01248 
01249          // Divide the content into lines, based on chr(10):
01250       $parts = explode(chr(10),$value);
01251       foreach($parts as $k => $v)   {
01252 
01253             // Processing of line content:
01254          if (!strcmp(trim($parts[$k]),''))   {  // If the line is blank, set it to &nbsp;
01255             $parts[$k]='&nbsp;';
01256          } else { // Clean the line content:
01257             $parts[$k]=$this->HTMLcleaner($parts[$k],$keepTags,$kUknown,$hSC);
01258             if ($convNBSP) $parts[$k]=str_replace('&amp;nbsp;','&nbsp;',$parts[$k]);
01259          }
01260 
01261             // Wrapping the line in <$dT> is not already wrapped:
01262          $testStr = strtolower(trim($parts[$k]));
01263          if (substr($testStr,0,4)!='<div' || substr($testStr,-6)!='</div>')   {
01264             if (substr($testStr,0,2)!='<p' || substr($testStr,-4)!='</p>') {
01265                   // Only set p-tags if there is not already div or p tags:
01266                $parts[$k]='<'.$dT.'>'.$parts[$k].'</'.$dT.'>';
01267             }
01268          }
01269       }
01270 
01271          // Implode result:
01272       return implode(chr(10),$parts);
01273    }
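
The line-wrapping part of this function can be sketched without the HTMLcleaner step. The function name `setDivTagsSketch()` is illustrative, and the per-line cleaning and the &amp;nbsp; conversion from the listing are deliberately omitted.

```php
<?php
// Wraps every chr(10)-separated line in <$dT>...</$dT> unless it already is a div or p section.
function setDivTagsSketch(string $value, string $dT = 'p'): string
{
    $parts = explode("\n", $value);
    foreach ($parts as $k => $line) {
        // Blank lines become a non-breaking space so the paragraph stays visible.
        if (trim($line) === '') {
            $parts[$k] = '&nbsp;';
        }
        $test = strtolower(trim($parts[$k]));
        $isDiv = substr($test, 0, 4) === '<div' && substr($test, -6) === '</div>';
        $isP = substr($test, 0, 2) === '<p' && substr($test, -4) === '</p>';
        if (!$isDiv && !$isP) {
            $parts[$k] = '<' . $dT . '>' . $parts[$k] . '</' . $dT . '>';
        }
    }
    return implode("\n", $parts);
}
```

For example, the three input lines `Hello`, an empty line and `<div>x</div>` become `<p>Hello</p>`, `<p>&nbsp;</p>` and the unchanged `<div>x</div>`.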

siteUrl ()

Returns SiteURL based on thisScript.

Returns:
string Value of t3lib_div::getIndpEnv('TYPO3_SITE_URL');
See also:
t3lib_div::getIndpEnv()

Definition at line 1322 of file class.t3lib_parsehtml_proc.php.

Referenced by localFolderTree::SC_browse_links::expandPage(), TS_AtagToAbs(), t3lib_parsehtml_proc::TS_images_db(), t3lib_parsehtml_proc::TS_images_rte(), t3lib_parsehtml_proc::TS_links_db(), t3lib_parsehtml_proc::TS_links_rte(), t3lib_parsehtml_proc::TS_reglinks(), and urlInfoForLinkTags().

01322                         {
01323       return t3lib_div::getIndpEnv('TYPO3_SITE_URL');
01324    }

TS_AtagToAbs ($value, $dontSetRTEKEEP=FALSE)

Converting <a>-tags to absolute URLs (+ setting rtekeep attribute).

Parameters:
string Content input
boolean If true, then the "rtekeep" attribute will not be set.
Returns:
string Content output

Definition at line 1484 of file class.t3lib_parsehtml_proc.php.

References siteUrl().

Referenced by t3lib_parsehtml_proc::TS_links_rte(), and t3lib_parsehtml_proc::TS_reglinks().

01484                                                          {
01485       $blockSplit = $this->splitIntoBlock('A',$value);
01486       reset($blockSplit);
01487       while(list($k,$v)=each($blockSplit))   {
01488          if ($k%2)   {  // block:
01489             $attribArray = $this->get_tag_attributes_classic($this->getFirstTag($v),1);
01490 
01491                // Checking if there is a scheme, and if not, prepend the current url.
01492             if (strlen($attribArray['href']))   {  // ONLY do this if href has content - the <a> tag COULD be an anchor and if so, it should be preserved...
01493                $uP = parse_url(strtolower($attribArray['href']));
01494                if (!$uP['scheme'])  {
01495                   $attribArray['href'] = $this->siteUrl().substr($attribArray['href'],strlen($this->relBackPath));
01496                }
01497             } else {
01498                $attribArray['rtekeep'] = 1;
01499             }
01500             if (!$dontSetRTEKEEP)   $attribArray['rtekeep'] = 1;
01501 
01502             $bTag='<a '.t3lib_div::implodeAttributes($attribArray,1).'>';
01503             $eTag='</a>';
01504             $blockSplit[$k] = $bTag.$this->TS_AtagToAbs($this->removeFirstAndLastTag($blockSplit[$k])).$eTag;
01505          }
01506       }
01507       return implode('',$blockSplit);
01508    }
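
The URL handling in the listing can be sketched on its own. The sketch below is an assumption-laden simplification: `aTagsToAbsSketch()` is an illustrative name, the site URL and back path are passed in as parameters instead of being read from the object, only double-quoted href attributes are handled, and the "rtekeep" attribute is not set.

```php
<?php
// Prefixes scheme-less href values in <a> tags with $siteUrl.
function aTagsToAbsSketch(string $html, string $siteUrl, string $relBackPath = ''): string
{
    $pattern = '#<a\b([^>]*?)\bhref="([^"]*)"#i';
    return preg_replace_callback($pattern, static function (array $m) use ($siteUrl, $relBackPath): string {
        $href = $m[2];
        // An empty href may belong to an anchor and is left alone.
        if ($href !== '' && !parse_url(strtolower($href), PHP_URL_SCHEME)) {
            // Cut off the relative back path, then prepend the site URL.
            $href = $siteUrl . substr($href, strlen($relBackPath));
        }
        return '<a' . $m[1] . 'href="' . $href . '"';
    }, $html);
}
```

With a site URL of `http://example.org/` and a back path of `../` (both made-up values), `<a href="../fileadmin/x.pdf">` becomes `<a href="http://example.org/fileadmin/x.pdf">`, while links that already carry a scheme are left unchanged.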

TS_preserve_db ($value)

Preserve special tags.

Parameters:
string Content input
Returns:
string Content output

Definition at line 735 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_parsehtml_proc::RTE_transform().

00735                                     {
00736       if (!$this->preserveTags)  return $value;
00737 
00738          // Splitting into blocks for processing (span-tags are used for special tags)
00739       $blockSplit = $this->splitIntoBlock('span',$value);
00740       foreach($blockSplit as $k => $v) {
00741          if ($k%2)   {  // block:
00742             $attribArray=$this->get_tag_attributes_classic($this->getFirstTag($v));
00743             if ($attribArray['specialtag'])  {
00744                $theTag = rawurldecode($attribArray['specialtag']);
00745                $theTagName = $this->getFirstTagName($theTag);
00746                $blockSplit[$k] = $theTag.$this->removeFirstAndLastTag($blockSplit[$k]).'</'.$theTagName.'>';
00747             }
00748          }
00749       }
00750       return implode('',$blockSplit);
00751    }

TS_preserve_rte ($value)

Preserve special tags.

Parameters:
string Content input
Returns:
string Content output

Definition at line 759 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_parsehtml_proc::RTE_transform().

00759                                     {
00760       if (!$this->preserveTags)  return $value;
00761 
00762       $blockSplit = $this->splitIntoBlock($this->preserveTags,$value);
00763       foreach($blockSplit as $k => $v) {
00764          if ($k%2)   {  // block:
00765             $blockSplit[$k] = '<span specialtag="'.rawurlencode($this->getFirstTag($v)).'">'.$this->removeFirstAndLastTag($blockSplit[$k]).'</span>';
00766          }
00767       }
00768       return implode('',$blockSplit);
00769    }
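
TS_preserve_rte() and TS_preserve_db() form a round trip: on the way to the RTE a preserved tag is hidden in a span with a URL-encoded "specialtag" attribute, and on the way back the original tag is restored. A minimal sketch of that round trip for a single tag name is shown below. The function names are illustrative, and regular expressions stand in for splitIntoBlock(), so nested tags are not handled.

```php
<?php
// Direction "rte": wrap <$tag ...>...</$tag> in a span that remembers the opening tag.
function preserveForRteSketch(string $value, string $tag): string
{
    $t = preg_quote($tag, '#');
    $pattern = '#(<' . $t . '\b[^>]*>)(.*?)</' . $t . '>#is';
    return preg_replace_callback($pattern, static function (array $m): string {
        return '<span specialtag="' . rawurlencode($m[1]) . '">' . $m[2] . '</span>';
    }, $value);
}

// Direction "db": turn the span back into the original tag.
function restoreForDbSketch(string $value): string
{
    $pattern = '#<span specialtag="([^"]*)">(.*?)</span>#is';
    return preg_replace_callback($pattern, static function (array $m): string {
        $openTag = rawurldecode($m[1]);
        // Derive the closing tag from the name found in the decoded opening tag.
        if (!preg_match('#^<\s*([a-z0-9:_-]+)#i', $openTag, $name)) {
            return $m[0]; // Not a recognisable tag: leave the span untouched.
        }
        return $openTag . $m[2] . '</' . $name[1] . '>';
    }, $value);
}
```

The URL encoding keeps quotes and angle brackets of the original tag out of the attribute value, which is why the decoded string can be reused verbatim as the opening tag.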

TS_strip_db ($value)

Transformation handler: 'ts_strip' / direction: "db". Removes all non-allowed tags.

Parameters:
string Content input
Returns:
string Content output

Definition at line 962 of file class.t3lib_parsehtml_proc.php.

Referenced by t3lib_parsehtml_proc::RTE_transform().

00962                                  {
00963       $value = strip_tags($value,'<'.implode('><',explode(',','b,i,u,a,img,br,div,center,pre,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote')).'>');
00964       return $value;
00965    }

TS_transform_db ($value, $css=FALSE)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "db". Cleaning (->db) for standard content elements (ts).

Parameters:
string Content input
boolean If true, the transformation was "css_transform", otherwise "ts_transform"
Returns:
string Content output
See also:
TS_transform_rte()

Definition at line 780 of file class.t3lib_parsehtml_proc.php.

References defaultTStagMapping(), divideIntoLines(), HTMLcleaner_db(), and table().

Referenced by t3lib_parsehtml_proc::RTE_transform().

00780                                                 {
00781 
00782          // safety... so forever loops are avoided (they should not occur, but an error would potentially do this...)
00783       $this->TS_transform_db_safecounter--;
00784       if ($this->TS_transform_db_safecounter<0) return $value;
00785 
00786          // Split the content from RTE by the occurence of these blocks:
00787       $blockSplit = $this->splitIntoBlock('TABLE,BLOCKQUOTE,'.$this->headListTags,$value);
00788 
00789       $cc=0;
00790       $aC = count($blockSplit);
00791 
00792          // Traverse the blocks
00793       foreach($blockSplit as $k => $v) {
00794          $cc++;
00795          $lastBR = $cc==$aC ? '' : chr(10);
00796 
00797          if ($k%2)   {  // Inside block:
00798 
00799                // Init:
00800             $tag=$this->getFirstTag($v);
00801             $tagName=strtolower($this->getFirstTagName($v));
00802 
00803                // Process based on the tag:
00804             switch($tagName)  {
00805                case 'blockquote':   // Keep blockquotes, but clean the inside recursively in the same manner as the main code
00806                   $blockSplit[$k]='<'.$tagName.'>'.$this->TS_transform_db($this->removeFirstAndLastTag($blockSplit[$k]),$css).'</'.$tagName.'>'.$lastBR;
00807                break;
00808                case 'ol':
00809                case 'ul':  // Transform lists into <typolist>-tags:
00810                   if (!$css)  {
00811                      if (!isset($this->procOptions['typolist']) || $this->procOptions['typolist']) {
00812                         $parts = $this->getAllParts($this->splitIntoBlock('LI',$this->removeFirstAndLastTag($blockSplit[$k])),1,0);
00813                         while(list($k2)=each($parts)) {
00814                            $parts[$k2]=ereg_replace(chr(10).'|'.chr(13),'',$parts[$k2]);  // remove all line breaks!
00815                            $parts[$k2]=$this->defaultTStagMapping($parts[$k2],'db');
00816                            $parts[$k2]=$this->cleanFontTags($parts[$k2],0,0,0);
00817                            $parts[$k2] = $this->HTMLcleaner_db($parts[$k2],strtolower($this->procOptions['allowTagsInTypolists']?$this->procOptions['allowTagsInTypolists']:'br,font,b,i,u,a,img,span,strong,em'));
00818                         }
00819                         if ($tagName=='ol')  { $params=' type="1"'; } else { $params=''; }
00820                         $blockSplit[$k]='<typolist'.$params.'>'.chr(10).implode(chr(10),$parts).chr(10).'</typolist>'.$lastBR;
00821                      }
00822                   } else {
00823                      $blockSplit[$k].=$lastBR;
00824                   }
00825                break;
00826                case 'table':  // Tables are NOT allowed in any form (unless preserveTables is set or CSS is the mode)
00827                   if (!$this->procOptions['preserveTables'] && !$css)   {
00828                      $blockSplit[$k]=$this->TS_transform_db($this->removeTables($blockSplit[$k]));
00829                   } else {
00830                      $blockSplit[$k]=str_replace(chr(10),'',$blockSplit[$k]).$lastBR;
00831                   }
00832                break;
00833                case 'h1':
00834                case 'h2':
00835                case 'h3':
00836                case 'h4':
00837                case 'h5':
00838                case 'h6':
00839                   if (!$css)  {
00840                      $attribArray=$this->get_tag_attributes_classic($tag);
00841                         // Processing inner content here:
00842                      $innerContent = $this->HTMLcleaner_db($this->removeFirstAndLastTag($blockSplit[$k]));
00843 
00844                      if (!isset($this->procOptions['typohead']) || $this->procOptions['typohead']) {
00845                         $type = intval(substr($tagName,1));
00846                         $blockSplit[$k]='<typohead'.
00847                                     ($type!=6?' type="'.$type.'"':'').
00848                                     ($attribArray['align']?' align="'.$attribArray['align'].'"':'').
00849                                     ($attribArray['class']?' class="'.$attribArray['class'].'"':'').
00850                                     '>'.
00851                                     $innerContent.
00852                                     '</typohead>'.
00853                                     $lastBR;
00854                      } else {
00855                         $blockSplit[$k]='<'.$tagName.
00856                                     ($attribArray['align']?' align="'.htmlspecialchars($attribArray['align']).'"':'').
00857                                     ($attribArray['class']?' class="'.htmlspecialchars($attribArray['class']).'"':'').
00858                                     '>'.
00859                                     $innerContent.
00860                                     '</'.$tagName.'>'.
00861                                     $lastBR;
00862                      }
00863                   } else {
00864                      $blockSplit[$k].=$lastBR;
00865                   }
00866                break;
00867                default:
00868                   $blockSplit[$k].=$lastBR;
00869                break;
00870             }
00871          } else { // NON-block:
00872             if (strcmp(trim($blockSplit[$k]),''))  {
00873                $blockSplit[$k]=$this->divideIntoLines($blockSplit[$k]).$lastBR;
00874             } else unset($blockSplit[$k]);
00875          }
00876       }
00877       $this->TS_transform_db_safecounter++;
00878 
00879       return implode('',$blockSplit);
00880    }
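
The h1 to h6 branch of the listing above maps a heading to a typohead tag, leaving out the type attribute for level 6. A minimal sketch of that mapping, assuming a hypothetical helper headingToTypoheadSketch(); it only recognises double-quoted align and class attributes and does not run the inner content through HTMLcleaner_db():

```php
<?php
// Hypothetical helper mirroring the h1..h6 branch; attribute parsing is simplified.
function headingToTypoheadSketch(string $block): string
{
    if (!preg_match('~^<h([1-6])\b([^>]*)>(.*)</h\1>$~is', trim($block), $m)) {
        return $block; // not a heading block: leave untouched
    }
    $type = (int) $m[1];
    $attributes = '';
    foreach (['align', 'class'] as $name) {
        // Only double-quoted attribute values are recognised in this sketch.
        if (preg_match('~\b' . $name . '="([^"]*)"~i', $m[2], $a) && $a[1] !== '') {
            $attributes .= ' ' . $name . '="' . $a[1] . '"';
        }
    }
    // As in the listing, level 6 is the default and carries no type attribute.
    return '<typohead' . ($type !== 6 ? ' type="' . $type . '"' : '') . $attributes . '>'
        . $m[3] . '</typohead>';
}

echo headingToTypoheadSketch('<h2 align="center">News</h2>'), "\n";
echo headingToTypoheadSketch('<h6>Footnote</h6>'), "\n";
```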

TS_transform_rte ($value, $css=0)

Transformation handler: 'ts_transform' + 'css_transform' / direction: "rte" Set (->rte) for standard content elements (ts).

Parameters:
string Content input
boolean If true, the transformation was "css_transform", otherwise "ts_transform"
Returns:
string Content output
See also:
TS_transform_db()

Definition at line 891 of file class.t3lib_parsehtml_proc.php.

References setDivTags().

Referenced by t3lib_parsehtml_proc::RTE_transform().

00891                                              {
00892 
00893          // Split the content from Database by the occurrence of these blocks:
00894       $blockSplit = $this->splitIntoBlock('TABLE,BLOCKQUOTE,TYPOLIST,TYPOHEAD,'.$this->headListTags,$value);
00895 
00896          // Traverse the blocks
00897       foreach($blockSplit as $k => $v) {
00898          if ($k%2)   {  // Inside one of the blocks:
00899 
00900                // Init:
00901             $tag = $this->getFirstTag($v);
00902             $tagName = strtolower($this->getFirstTagName($v));
00903             $attribArray = $this->get_tag_attributes_classic($tag);
00904 
00905                // Based on tagname, we do transformations:
00906             switch($tagName)  {
00907                case 'blockquote':   // Keep blockquotes:
00908                   $blockSplit[$k] = $tag.
00909                                  $this->TS_transform_rte($this->removeFirstAndLastTag($blockSplit[$k]),$css).
00910                                  '</'.$tagName.'>';
00911                break;
00912                case 'typolist':  // Transform typolist blocks into OL/UL lists. Type 1 is expected to be numerical block
00913                   if (!isset($this->procOptions['typolist']) || $this->procOptions['typolist']) {
00914                      $tListContent = $this->removeFirstAndLastTag($blockSplit[$k]);
00915                      $tListContent = ereg_replace('^[ ]*'.chr(10),'',$tListContent);
00916                      $tListContent = ereg_replace(chr(10).'[ ]*$','',$tListContent);
00917                      $lines = explode(chr(10),$tListContent);
00918                      $typ = $attribArray['type']==1 ? 'ol' : 'ul';
00919                      $blockSplit[$k] = '<'.$typ.'>'.chr(10).
00920                                     '<li>'.implode('</li>'.chr(10).'<li>',$lines).'</li>'.
00921                                     '</'.$typ.'>';
00922                   }
00923                break;
00924                case 'typohead':  // Transform typohead into Hx tags.
00925                   if (!isset($this->procOptions['typohead']) || $this->procOptions['typohead']) {
00926                      $tC = $this->removeFirstAndLastTag($blockSplit[$k]);
00927                      $typ = t3lib_div::intInRange($attribArray['type'],0,6);
00928                      if (!$typ)  $typ=6;
00929                      $align = $attribArray['align']?' align="'.$attribArray['align'].'"': '';
00930                      $class = $attribArray['class']?' class="'.$attribArray['class'].'"': '';
00931                      $blockSplit[$k] = '<h'.$typ.$align.$class.'>'.
00932                                     $tC.
00933                                     '</h'.$typ.'>';
00934                   }
00935                break;
00936             }
00937             $blockSplit[$k+1] = ereg_replace('^[ ]*'.chr(10),'',$blockSplit[$k+1]); // Removing linebreak if typohead
00938          } else { // NON-block:
00939             $nextFTN = $this->getFirstTagName($blockSplit[$k+1]);
00940             $singleLineBreak = $blockSplit[$k]==chr(10);
00941             if (t3lib_div::inList('TABLE,BLOCKQUOTE,TYPOLIST,TYPOHEAD,'.$this->headListTags,$nextFTN))   {  // Removing linebreak if typolist/typohead
00942                $blockSplit[$k] = ereg_replace(chr(10).'[ ]*$','',$blockSplit[$k]);
00943             }
00944                // If $blockSplit[$k] is blank then unset the line. UNLESS the line happened to be a single line break.
00945             if (!strcmp($blockSplit[$k],'') && !$singleLineBreak) {
00946                unset($blockSplit[$k]);
00947             } else {
00948                $blockSplit[$k] = $this->setDivTags($blockSplit[$k],($this->procOptions['useDIVasParagraphTagForRTE']?'div':'p'));
00949             }
00950          }
00951       }
00952       return implode(chr(10),$blockSplit);
00953    }
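
The typolist branch of the listing above turns each line inside a typolist tag into one list item, with type="1" selecting a numbered list. A minimal sketch of that step, assuming a hypothetical helper typolistToListSketch() and using preg_replace() in place of ereg_replace(), which current PHP versions no longer provide:

```php
<?php
// Hypothetical helper mirroring the 'typolist' branch of the listing above.
function typolistToListSketch(string $block): string
{
    if (!preg_match('~^<typolist\b([^>]*)>(.*)</typolist>$~is', trim($block), $m)) {
        return $block; // not a typolist block: leave untouched
    }
    // type="1" selects a numbered list; anything else becomes a bullet list.
    $listTag = preg_match('~\btype="1"~i', $m[1]) ? 'ol' : 'ul';
    // Drop the line break right after the opening tag and right before the closing tag.
    $content = preg_replace('~^[ ]*\n~', '', $m[2]);
    $content = preg_replace('~\n[ ]*$~', '', $content);
    $lines = explode("\n", $content);
    return '<' . $listTag . ">\n"
        . '<li>' . implode("</li>\n<li>", $lines) . '</li>'
        . '</' . $listTag . '>';
}

echo typolistToListSketch("<typolist type=\"1\">\nFirst\nSecond\n</typolist>"), "\n";
```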

urlInfoForLinkTags ($url)

Parse <a>-tag href and return status of email, external, file or page.

Parameters:
string URL to analyse.
Returns:
array Information in an array about the URL

Definition at line 1425 of file class.t3lib_parsehtml_proc.php.

References $a, siteUrl(), and TYPO3_mainDir.

Referenced by t3lib_parsehtml_proc::TS_links_db().

01425                                        {
01426       $info = array();
01427       $url = trim($url);
01428       if (substr(strtolower($url),0,7)=='mailto:') {
01429          $info['url']=trim(substr($url,7));
01430          $info['type']='email';
01431       } else {
01432          $curURL = $this->siteUrl();   // 100502, removed this: 'http://'.t3lib_div::getThisUrl(); Reason: The url returned had typo3/ in the end - should be only the site's url as far as I see...
01433          for($a=0;$a<strlen($url);$a++)   {
01434             if ($url[$a]!=$curURL[$a]) {
01435                break;
01436             }
01437          }
01438 
01439          $info['relScriptPath']=substr($curURL,$a);
01440          $info['relUrl']=substr($url,$a);
01441          $info['url']=$url;
01442          $info['type']='ext';
01443 
01444          $siteUrl_parts = parse_url($url);
01445          $curUrl_parts = parse_url($curURL);
01446 
01447          if ($siteUrl_parts['host']==$curUrl_parts['host']  // Hosts should match
01448             && (!$info['relScriptPath']   || (defined('TYPO3_mainDir') && substr($info['relScriptPath'],0,strlen(TYPO3_mainDir))==TYPO3_mainDir))) {  // If the script path seems to match or is empty (FE-EDIT)
01449 
01450                // New processing order 100502
01451             $uP=parse_url($info['relUrl']);
01452 
01453             if (!strcmp('#'.$siteUrl_parts['fragment'],$info['relUrl'])) {
01454                $info['url']=$info['relUrl'];
01455                $info['type']='anchor';
01456             } elseif (!trim($uP['path']) || !strcmp($uP['path'],'index.php')) {
01457                $pp = explode('id=',$uP['query']);
01458                $id = trim($pp[1]);
01459                if ($id) {
01460                   $info['pageid']=$id;
01461                   $info['cElement']=$uP['fragment'];
01462                   $info['url']=$id.($info['cElement']?'#'.$info['cElement']:'');
01463                   $info['type']='page';
01464                }
01465             } else {
01466                $info['url']=$info['relUrl'];
01467                $info['type']='file';
01468             }
01469          } else {
01470             unset($info['relScriptPath']);
01471             unset($info['relUrl']);
01472          }
01473       }
01474       return $info;
01475    }
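
The listing above finds the part of the URL that follows the site URL by comparing both strings character by character, then reads the page id from the query string. A minimal sketch of those two steps, assuming a hypothetical helper commonPrefixLength() and an example site URL; unlike the listing, the comparison is bounded by the shorter string so neither string is read past its end:

```php
<?php
// Hypothetical helper: length of the shared leading part of two strings,
// with an explicit bound so neither string is read past its end.
function commonPrefixLength(string $a, string $b): int
{
    $max = min(strlen($a), strlen($b));
    $i = 0;
    while ($i < $max && $a[$i] === $b[$i]) {
        $i++;
    }
    return $i;
}

$siteUrl = 'http://example.org/';                     // assumed site URL
$url     = 'http://example.org/index.php?id=42#c7';   // assumed link target

// The relative part is whatever follows the shared prefix.
$relUrl = substr($url, commonPrefixLength($url, $siteUrl));
$parts  = parse_url($relUrl);

// Same split as the listing: everything after "id=" in the query is the page id.
$pieces = explode('id=', $parts['query'] ?? '');
$pageId = trim($pieces[1] ?? '');

echo $relUrl, "\n";
echo $pageId . (isset($parts['fragment']) ? '#' . $parts['fragment'] : ''), "\n";
```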


Generated on Sun Oct 3 01:06:02 2004 for TYPO3core 3.7.0 dev by  doxygen 1.3.8-20040913