Tuesday, September 27, 2016

Fuzzy Dates

I can remember Fuzzy Logic being quite the thing in the nascent computer-based AI projects of the late 1970s and early 1980s. The idea was that you might not have precise values for a given piece of data, or that a set of rules might be more or less applicable within a given process: things could be fuzzy, but you could still hope to obtain a near-optimal output for a given system. If I recall correctly, some of the Expert Systems development work relied upon classes of input that were often akin to broad estimates (or at least ranges) rather than precise values.

Anyone who has developed software that deals with scheduling or calendars has run into issues relating to human cognitive imprecision with dates and times. Often there is an intention to get something done “that evening” or by “Thursday week”, and mostly software ends up forcing the user to plump for some measure of precision and carry the fuzziness of that intention in their head, which is suboptimal.

Entering dates into a computer-based system has always been a programming challenge. In far-off pre-GUI days we built a callable routine for date entry on VAX/VMS that was pretty liberal in what it would accept and attempt to convert into a precise date. Among a number of date formats, this also included shorthand like t (today) and t-1 (yesterday), etc. GUIs have brought us clickable calendar-like objects, but these can be remarkably clumsy to use and do little to help with the issues around imprecision.

App outputs need to do more to be human sensible. “Tomorrow” or “next week” might communicate a date more effectively to a user than 02/11/2016 (which in any case might be in November or February – you choose). A while back we prototyped some software that converted a date of birth to an age and, within the context of young children, that needed to output values like “3 days old” or “4 weeks” or “9 months” or “4 years” as the date receded into the past and our human expression of age changed accordingly.

The iOS NSDateFormatter has a doesRelativeDateFormatting option that can manage yesterday, today and tomorrow (at least), Rails has a more extensive distance_of_time_in_words function, and I believe that John Resig wrote pretty.js for JavaScript. I do not doubt that a great many programmers have knocked up something along these lines over the years. These are all fine, but they are based upon a specific date or date/time. Being a little less sure about when (“next week” is a nice example) makes things a little more interesting.

The Chrono project on GitHub is a nice introduction to the challenges of parsing a date/time string as precise as Sat Aug 17 2013 18:40:39 GMT+0900 or as fuzzy as “last Friday”. That in turn raises the implementation issue of how precisely a given user will supply their date, as well as reminding us that in many contexts the time zone might be extremely relevant. There is also some discussion with code samples on StackExchange.

Storing an imprecise date or date/time is probably best accomplished with a maximum and a minimum value. One might consider storing a single value with a precision indicator, but I suspect that code would inevitably be required to convert that precision into a range for comparison purposes, so you might as well settle for twin values from the get-go. There are some interesting StackExchange discussions on this topic as well.
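
As a minimal sketch of the twin-value idea, the range could be held in a small immutable type with a couple of comparison helpers. The name DateRange and all of its members are illustrative assumptions, not part of .NET or of any library mentioned here.

```csharp
using System;

// Hypothetical type for illustration: an imprecise date stored as a closed range.
public sealed class DateRange
{
    public DateTime Min { get; }
    public DateTime Max { get; }

    public DateRange(DateTime min, DateTime max)
    {
        if (max < min) throw new ArgumentException("max must not precede min");
        Min = min;
        Max = max;
    }

    // A precise date is simply a range of zero width.
    public bool IsFuzzy => Min != Max;

    // True when the two ranges share at least one instant.
    public bool Overlaps(DateRange other) => Min <= other.Max && other.Min <= Max;

    // True when a precise moment falls inside the range.
    public bool Contains(DateTime moment) => Min <= moment && moment <= Max;
}
```

With twin values, a question such as “could these two events clash?” reduces to a single Overlaps call, with no precision indicator to decode first.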

A satisfactory solution on the input side would need to take account of semantics. The shorthand for today plus one week (T+1w perhaps) is not the same as “next week” which is also not the same as “in the next week”. A humanised output is probably simpler but it looks like this would have to be heavily influenced by context so a standardised class might need some alternate output vocabulary options (age and “elapsed time” being two obvious examples).
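
To make the semantic difference concrete, here is a rough sketch of how the three phrases might map to different ranges. The class and method names are hypothetical, and the sketch assumes that a week starts on Sunday (as the .NET DayOfWeek enumeration does); other interpretations are equally defensible.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class WeekSemantics
{
    // "T+1w": a single day, exactly seven days after the reference date.
    public static DateTime[] TodayPlusOneWeek(DateTime now)
    {
        var day = now.Date.AddDays(7);
        return new[] { day, day.AddDays(1).AddSeconds(-1) };
    }

    // "next week": the whole calendar week following the current one
    // (assumes weeks start on Sunday).
    public static DateTime[] NextWeek(DateTime now)
    {
        var start = now.Date.AddDays(7 - (int)now.DayOfWeek);
        return new[] { start, start.AddDays(7).AddSeconds(-1) };
    }

    // "in the next week": any time from now until seven days from now.
    public static DateTime[] InTheNextWeek(DateTime now)
    {
        return new[] { now, now.AddDays(7) };
    }
}
```

Taking 10:00 on Tuesday 27 September 2016 as the reference point, the first yields the single day 4 October, the second runs from Sunday 2 October to the end of Saturday 8 October, and the third runs from 10:00 on 27 September to 10:00 on 4 October.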

This is a DRAFT version of a class that will accept a range of strings and convert those in turn into a date range representing a fuzzy date. The class steals happily from some of the sources mentioned above. Input can include things like:


  • Today
  • Tomorrow
  • Yesterday
  • Next or Last Month
  • Next or Last Year
  • In the Next n Months
  • In the next n Days
  • During the next week/month/year
  • T(oday) + or – n d(ays)/w(eeks)/m(onths)/y(ears)
  • next/last (day of week)
  • next/last (named month)

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class FuzzyDate
{
    #region private declarations
    #region static declarations
    private static readonly DateTime sqlMinDate = new DateTime(1753, 1, 1); // minimum SQLServer datetime
    static List<string> dayList = new List<string>() { "sun", "mon", "tue", "wed", "thu", "fri", "sat" };
    static List<string> monthList = new List<string>() { "jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec" };

    // Parsers are tried in order and the first match wins, so the more specific patterns come first.
    private static List<IDateTimeInput> parsers = new List<IDateTimeInput>()
    {
        new RegexDateParser(
            @"next +([1-9]\d*) +months?", // any positive n (the earlier [2-9] rejected 10 to 19)
            delegate (Match m)
            {
                var val = int.Parse(m.Groups[1].Value);
                return new DateTime[] {DateTime.Now, DateTime.Now.AddMonths(val)};
            }),
        new RegexDateParser(
            @"next +([1-9]\d*) +days?",
            delegate(Match m)
            {
                var val = int.Parse(m.Groups[1].Value);
                return new DateTime[] {DateTime.Now, DateTime.Now.AddDays(val)};
            }),
        new RegexDateParser(
            @"tomorrow",
            delegate(Match m)
            {
                var dt = DateTime.Today.AddDays(1);
                return new DateTime[] {dt, new DateTime(dt.Year, dt.Month, dt.Day, 23, 59, 59, 999)};
            }),
        new RegexDateParser(
            @"today",
            delegate(Match m)
            {
                // from now until the end of the day
                var dt = DateTime.Now;
                return new DateTime[] { dt, new DateTime(dt.Year, dt.Month, dt.Day, 23, 59, 59, 999)};
            }),
        new RegexDateParser(
            @"yesterday",
            delegate(Match m)
            {
                var dt = DateTime.Today.AddDays(-1);
                return new DateTime[] { dt, new DateTime(dt.Year, dt.Month, dt.Day, 23, 59, 59, 999)};
            }),
        new RegexDateParser(
            // optional "the" so that "during the next week" is matched here
            // and does not fall through to the plain "next week" parser below
            @"(in|during) +(?:the +)?(last|next) +(year|month|week)",
            delegate(Match m)
            {
                if(m.Groups[2].Value == "last")
                {
                    switch(m.Groups[3].Value)
                    {
                        case "year":
                            var dt = DateTime.Today.AddYears(-1);
                            return new DateTime[] {dt, DateTime.Today};
                        case "month":
                            var dtm = DateTime.Today.AddMonths(-1);
                            return new DateTime[] {dtm, DateTime.Today};
                        case "week":
                            var dtl = DateTime.Today.AddDays(-7);
                            return new DateTime[] {dtl, DateTime.Today};
                        default:
                            return null;
                    }
                } else
                {
                    switch(m.Groups[3].Value)
                    {
                        case "year":
                            var dt = DateTime.Now.AddYears(1);
                            return new DateTime[] {DateTime.Now, dt};
                        case "month":
                            var dtm = DateTime.Now.AddMonths(1);
                            return new DateTime[] {DateTime.Now, dtm};
                        case "week":
                            var dtl = DateTime.Now;
                            return new DateTime[] {dtl, dtl.AddDays(7)};
                        default:
                            return null;
                    }
                }
            }),
        new RegexDateParser(
            @"(last|next) *(year|month|week)",
            delegate(Match m)
            {
                int val = (m.Groups[1].Value == "last")? -1 :1;
                switch(m.Groups[2].Value)
                {
                    case "year":
                        var dt = DateTime.Now.AddYears(val);
                        return new DateTime[] {new DateTime(dt.Year,1,1), new DateTime(dt.Year,12,31,23,59,59,999)};
                    case "month":
                        var dtm = DateTime.Now.AddMonths(val);
                        return new DateTime[] {startOfMonth(dtm), endOfMonth(dtm)};
                    case "week":
                        // whole calendar week, Sunday to Saturday
                        val = (val == 1) ? 7 - (int)DateTime.Today.DayOfWeek : -(7 + (int)DateTime.Today.DayOfWeek);
                        var dtl = DateTime.Today.AddDays(val);
                        return new DateTime[] {dtl, dtl.AddDays(7).AddSeconds(-1)};
                    default:
                        return null;
                }
            }),
        new RegexDateParser(
            String.Format(@"(last|next) *({0}).*", String.Join("|", dayList.ToArray())),
            delegate(Match m)
            {
                var day = m.Groups[2].Value;
                var val = dayList.IndexOf(day.Substring(0,3));
                var adj = (m.Groups[1].Value == "last") ? -1 : 1;
                if(val >= 0)
                {
                    // number of days (1 to 7) to step in the chosen direction
                    val = adj * (val - (int)DateTime.Today.DayOfWeek);
                    if(val <= 0) {val += 7; }
                    var dt = DateTime.Today.AddDays(val * adj);
                    return new DateTime[] {dt, dt.AddDays(1).AddSeconds(-1)};
                } else { return null;}
            }),
        new RegexDateParser(
            String.Format(@"(last|next) *({0}).*", String.Join("|", monthList.ToArray())),
            delegate(Match m)
            {
                var month = m.Groups[2].Value;
                var val = monthList.IndexOf(month.Substring(0,3));
                var adj = (m.Groups[1].Value == "last") ? -1 : 1;
                if(val >= 0)
                {
                    // number of months (1 to 12) to step in the chosen direction
                    val = adj * (val - ((int)DateTime.Today.Month-1));
                    if(val <= 0) {val += 12; }
                    // start from the first of this month so AddMonths cannot clamp the day (e.g. the 31st)
                    var dt = startOfMonth(DateTime.Today).AddMonths(val * adj);
                    return new DateTime[] {dt, dt.AddMonths(1).AddSeconds(-1)};
                } else { return null;}
            }),
        new RegexDateParser(
            @"t(\s)?(\-|\+)(\s)?([1-9]\d*)(\s)?(d|m|y|w)", // t +/- n d,w,m,y format
            delegate(Match m)
            {
                string sign = m.Groups[2].Value;
                string mVal = m.Groups[4].Value;
                string mType = m.Groups[6].Value;
                var val = Int32.Parse(mVal);
                if(sign == "-") {val *= -1; }
                DateTime tDay = DateTime.Today;
                switch (mType)
                {
                    case "y":
                        tDay = tDay.AddYears(val);
                        break;
                    case "m":
                        tDay = tDay.AddMonths(val);
                        break;
                    case "w":
                        tDay = tDay.AddDays(7 * val);
                        break;
                    case "d":
                        tDay = tDay.AddDays(val);
                        break;
                    default:
                        return null;
                }
                return new DateTime[] {tDay, tDay.AddDays(1).AddSeconds(-1)};
            })
    };
    #endregion

    private DateTime minDate = sqlMinDate;
    private DateTime maxDate = DateTime.MaxValue;
    private bool timeSignificant = true; // or false - we shall see
    #endregion

    #region public constructors and methods
    public FuzzyDate()
    {
    }
    public FuzzyDate(DateTime setDate)
    {
        minDate = maxDate = setDate;
    }
    public FuzzyDate(string dateString)
    {
        setDate(dateString.ToLower());
    }
    public void SetDate(string dateString)
    {
        setDate(dateString.ToLower());
    }
    public void SetDate(DateTime dateTime)
    {
        minDate = maxDate = dateTime;
    }
    #endregion

    #region public properties
    public DateTime MinDate
    {
        get { return minDate; }
        set { minDate = value; }
    }
    public DateTime MaxDate
    {
        get { return maxDate; }
        set { maxDate = value; }
    }
    public bool TimeSignificant
    {
        get { return timeSignificant; }
        set { timeSignificant = value; }
    }
    public bool IsFuzzy
    {
        // fuzzy when the range has some width; a precise date has min == max
        get { return minDate != maxDate; }
    }
    #endregion

    #region private methods
    private void setDate(string dateString)
    {
        DateTime[] dt;
        foreach(var parser in parsers)
        {
            dt = parser.Parse(dateString);
            if(dt != null)
            {
                minDate = dt[0];
                maxDate = dt[1];
                break;
            }
        }
    }
    private static DateTime startOfMonth(DateTime dayInMonth)
    {
        return new DateTime(dayInMonth.Year, dayInMonth.Month, 1);
    }
    private static DateTime endOfMonth(DateTime dayInMonth)
    {
        var rVal = dayInMonth.AddMonths(1);
        rVal = new DateTime(rVal.Year, rVal.Month, 1);
        return rVal.AddSeconds(-1);
    }
    #endregion

    #region local interface and class
    private interface IDateTimeInput
    {
        DateTime[] Parse(string dateString);
    }
    private class RegexDateParser : IDateTimeInput
    {
        public delegate DateTime[] Interpreter(Match m);
        protected Regex regEx;
        protected Interpreter interpreter;
        public RegexDateParser(string regexString, Interpreter interpreter)
        {
            regEx = new Regex(regexString);
            this.interpreter = interpreter;
        }
        public DateTime[] Parse(string dateString)
        {
            var match = regEx.Match(dateString);
            if (match.Success)
            {
                return interpreter(match);
            }
            return null;
        }
    }
    #endregion
}

The output of human sensible dates based upon a given DateTime is rather simpler with the details being rather dependent upon the context. In my working examples of “elapsed time” and “age” the output is dependent upon the difference between a given date and the current one. This is normally expressed in the .NET environment as a TimeSpan object. However the TimeSpan class lacks a few features that would make life much simpler when dealing with time spans that represent more than just a few days. I have therefore explored the potential for a custom DateTimeSpan class that could probably be collapsed and simplified into a set of extensions for the .NET supplied standard. So please treat this as another draft idea.

using System;

public class DateTimeSpan
{
    private const int WEEK = 7;
    private const int MONTH = 30;
    private const int YEAR = 365;
    private const int MONTHS_IN_YEAR = 12;
    private int years;
    private int months;
    private int weeks;
    private int days;
    private int hours;
    private Int64 minutes;
    private Int64 seconds;
    private Int64 milliseconds;

    // With only a TimeSpan to go on, months and years can only be estimated.
    public DateTimeSpan(TimeSpan timeSpan)
    {
        grabValues(timeSpan);
        estimateValues();
    }
    public DateTimeSpan(DateTime datePast)
    {
        grabValues(DateTime.Now.Subtract(datePast));
        calculateValues(datePast, DateTime.Now);
    }
    public DateTimeSpan(DateTime fromDate, DateTime toDate)
    {
        grabValues(toDate.Subtract(fromDate));
        calculateValues(fromDate, toDate);
    }

    public int TotalYears { get { return years; } }
    public int TotalMonths { get { return months; } }
    public int TotalWeeks { get { return weeks; } }
    public int TotalDays { get { return days; } }
    public int TotalHours { get { return hours; } }
    public Int64 TotalMinutes { get { return minutes; } }
    public Int64 TotalSeconds { get { return seconds; } }
    public Int64 TotalMilliseconds { get { return milliseconds; } }
    public bool IsNegative { get { return milliseconds < 0; } }

    private void grabValues(TimeSpan timeSpan)
    {
        milliseconds = (Int64)timeSpan.TotalMilliseconds;
        seconds = (Int64)timeSpan.TotalSeconds;
        minutes = (Int64)timeSpan.TotalMinutes;
        hours = (int)timeSpan.TotalHours;
        days = (int)timeSpan.TotalDays;
        if (days >= WEEK)
        {
            weeks = days / WEEK;
        }
    }

    // Approximation using 30-day months and 365-day years.
    private void estimateValues()
    {
        if(days >= MONTH)
        {
            months = days / MONTH;
        }
        if(days >= YEAR)
        {
            years = days / YEAR;
            months = years * MONTHS_IN_YEAR + ((days - years * YEAR) / MONTH);
        }
    }

    // Counts whole calendar months between the two dates.
    private void calculateValues(DateTime fromDate, DateTime toDate)
    {
        int sign = 1;
        if(fromDate > toDate)
        {
            var hdate = toDate;
            toDate = fromDate;
            fromDate = hdate;
            sign = -1;
        }
        int monthCount = -1;
        while (fromDate <= toDate)
        {
            fromDate = fromDate.AddMonths(1);
            monthCount++;
        }
        months = monthCount * sign;
        years = (monthCount / MONTHS_IN_YEAR) * sign; // keep the sign consistent with months
    }
}
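
The extension-method alternative mentioned above might start with something like the following. This is only a sketch under my own assumptions: the class and method names are hypothetical, and it computes whole calendar months arithmetically rather than by looping.

```csharp
using System;

// Hypothetical extension class for illustration only.
public static class DateTimeSpanExtensions
{
    // Whole calendar months from 'from' to 'to' (negative if 'to' is earlier).
    public static int WholeMonthsUntil(this DateTime from, DateTime to)
    {
        if (from > to) return -to.WholeMonthsUntil(from);
        int months = (to.Year - from.Year) * 12 + (to.Month - from.Month);
        // Step back one if the final month has not been completed yet.
        if (from.AddMonths(months) > to) months--;
        return months;
    }

    // Whole years derived from the month count.
    public static int WholeYearsUntil(this DateTime from, DateTime to)
    {
        return from.WholeMonthsUntil(to) / 12;
    }
}
```

Because AddMonths is applied once to the original start date, a start on the 31st is clamped once for the target month rather than drifting to the 28th over successive steps.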

Any given output requirement would probably only need one set of rules but just for fun here is a class that can manage two and could be easily expanded to handle more.

using System;
using System.ComponentModel;

public static class PrettyDate
{
    public enum PrettyTypes
    {
        [Description("Returns date as an age string")]
        Age,
        [Description("Returns date/time as elapsed time")]
        Elapsed
    }

    public static string GetPrettyDate(DateTime theDate, PrettyTypes prettyType)
    {
        DateTimeSpan dSince = new DateTimeSpan(theDate);
        if(prettyType == PrettyTypes.Elapsed)
        {
            if(dSince.IsNegative) { return "not happened yet"; }
            if(dSince.TotalYears > 0)
            {
                // could be tweaked for a rounded number in a number of ways - this is one
                int months = dSince.TotalMonths - dSince.TotalYears * 12;
                int years = (months >= 10) ? dSince.TotalYears + 1 : dSince.TotalYears;
                return years + " year" + ((years > 1) ? "s ago" : " ago");
            }
            else if (dSince.TotalMonths > 0)
            {
                return dSince.TotalMonths + " month" + ((dSince.TotalMonths > 1) ? "s ago" : " ago");
            }
            else if(dSince.TotalWeeks > 0)
            {
                return dSince.TotalWeeks + " week" + ((dSince.TotalWeeks > 1) ? "s ago" : " ago");
            }
            else if(dSince.TotalDays > 0)
            {
                return dSince.TotalDays + " day" + ((dSince.TotalDays > 1) ? "s ago" : " ago");
            }
            else if (dSince.TotalHours > 0)
            {
                return dSince.TotalHours + " hour" + ((dSince.TotalHours > 1) ? "s ago" : " ago");
            }
            else if(dSince.TotalMinutes > 0)
            {
                return dSince.TotalMinutes + " minute" + ((dSince.TotalMinutes > 1) ? "s ago" : " ago");
            }
            else
            {
                return "now"; // adjust to taste
            }
        }
        else
        {
            if (dSince.IsNegative) { return "not born yet"; }
            if (dSince.TotalYears >= 5)
            {
                return dSince.TotalYears + " years";
            }
            else if(dSince.TotalYears >= 3)
            {
                // between 3 and 5 years, show the age to the nearest quarter
                int months = dSince.TotalMonths - dSince.TotalYears * 12;
                string fract = "";
                if(months >= 8)
                {
                    fract = String.Concat(" ", (char)190); // ¾
                }
                else if(months >= 5)
                {
                    fract = String.Concat(" ", (char)189); // ½
                }
                else if(months >= 2)
                {
                    fract = String.Concat(" ", (char)188); // ¼
                }
                else
                {
                    fract = " years";
                }
                return dSince.TotalYears + fract;
            }
            else if(dSince.TotalMonths >= 6)
            {
                return dSince.TotalMonths + " months";
            }
            else if(dSince.TotalWeeks >= 2)
            {
                return dSince.TotalWeeks + " weeks";
            }
            else if(dSince.TotalDays >= 2)
            {
                return dSince.TotalDays + " days";
            }
            else
            {
                return "newborn";
            }
        }
    }
}

It is very likely that any application with a requirement to process dates in the manner suggested here would need substantial code changes to match the working context. I may well return to this topic to explore the usage of dates stored as a range rather than as a single value.

Edited Jan 2017
Added this link to a great piece by Peter-Paul Koch on "input type=date" and why that is (or is not) complicated even when user hostile. Well worth your time.
